Senior Site Reliability Engineer
Latitude AI
6 days ago
Palo Alto, CA, USA or Pittsburgh, PA, USA
Senior / Mid Level
H-1B Sponsor
Base Salary
$179k - $269k/yr
Responsibilities
- Build monitoring to ensure platform health and reliability.
- Create alerting systems and runbooks for faster issue detection and remediation.
- Debug complex issues across multiple components and implement fixes.
- Participate in on-call rotation and conduct blameless postmortems.
- Design and implement platform components to enhance customer efficiency.
- Develop Kubernetes controllers to automate operations.
Requirements
- Bachelor's degree in a relevant field and 4+ years of experience, or a Master's degree with 2+ years, or a PhD.
- Fundamental understanding of Linux internals, TCP/IP networking, and storage subsystems.
- Hands-on development experience in Go or Python for production software.
- Strong experience in scaling and securing cloud services (AWS, GCP).
- Experience with infrastructure-as-code tools like Terraform or CloudFormation.
- Proficiency in authoring and maintaining Kubernetes controllers in Go.
- Experience running Kubernetes in large-scale production environments.
- Familiarity with metrics, logging, and tracing systems.
- Ability to guide teams on engineering design limitations and performance scaling.
- Focus on increasing service reliability through SLOs.
- Strong communication skills for effective teamwork.
Benefits
- Competitive compensation packages.
- High-quality medical, dental, and vision insurance.
- Health savings account with employer match.
- Employer-matched 401(k) retirement plan with immediate vesting.
- Paid parental and medical leave.
- Unlimited vacation and 15 paid holidays.
- Daily lunches, snacks, and beverages in office locations.
- Pre-tax spending accounts for healthcare and dependent care.
- Monthly wellness stipend.
- Adoption/Surrogacy support program.
- Backup child and elder care program.
- Professional development reimbursement.
- Employee assistance program.
- Discounted programs for legal services, identity theft protection, and more.
- Company bonding activities and wellness initiatives.
Tech Stack
AWS, Elasticsearch, Go, Google Cloud Platform, Kubernetes, Linux, Prometheus, Python, Terraform
Categories
AI & ML, Data Engineering, DevOps, Security