Latitude AI

Senior Site Reliability Engineer

6 days ago
Palo Alto, CA, USA or Pittsburgh, PA, USA
Senior / Mid Level
H1B Sponsor

Base Salary

$179k - $269k/yr

Responsibilities

  • Build monitoring to ensure platform health and reliability.
  • Create alerting systems and runbooks for faster issue detection and remediation.
  • Debug complex issues across multiple components and implement fixes.
  • Participate in on-call rotation and conduct blameless postmortems.
  • Design and implement platform components to enhance customer efficiency.
  • Develop Kubernetes controllers to automate operations.

Requirements

  • Bachelor's degree in a relevant field and 4+ years of experience, or a Master's degree with 2+ years, or a PhD.
  • Fundamental understanding of Linux internals, TCP/IP networking, and storage subsystems.
  • Hands-on development experience in Go or Python for production software.
  • Strong experience in scaling and securing cloud services (AWS, GCP).
  • Experience with infrastructure-as-code tools like Terraform or CloudFormation.
  • Proficiency in authoring and maintaining Kubernetes controllers in Go.
  • Experience running Kubernetes in large-scale production environments.
  • Familiarity with metrics, logging, and tracing systems.
  • Ability to guide teams on engineering design limitations and performance scaling.
  • Focus on increasing service reliability through SLOs.
  • Strong communication skills for effective teamwork.

Benefits

  • Competitive compensation packages.
  • High-quality medical, dental, and vision insurance.
  • Health savings account with employer match.
  • Employer-matched 401(k) retirement plan with immediate vesting.
  • Paid parental and medical leave.
  • Unlimited vacation and 15 paid holidays.
  • Daily lunches, snacks, and beverages in office locations.
  • Pre-tax spending accounts for healthcare and dependent care.
  • Monthly wellness stipend.
  • Adoption/Surrogacy support program.
  • Backup child and elder care program.
  • Professional development reimbursement.
  • Employee assistance program.
  • Discounted programs for legal services, identity theft protection, and more.
  • Company bonding activities and wellness initiatives.

Tech Stack

AWS, Elasticsearch, Go, Google Cloud Platform, Kubernetes, Linux, Prometheus, Python, Terraform

Categories

AI & ML, Data Engineering, DevOps, Security