Lambda

Senior Site Reliability Engineer - Observability

about 8 hours ago
San Francisco, CA, USA or San Jose, CA, USA
Senior / Staff+
H1B Sponsor

Base Salary

$240k - $401k/yr

Responsibilities

  • Deploy and operate observability platforms for logging, metrics, and distributed tracing.
  • Automate the deployment and operation of observability systems.
  • Set up monitoring for modern AI/HPC cluster infrastructure.
  • Develop platform software to enhance observability and product reliability.
  • Lead engineering teams in developing solutions for their monitoring challenges.

Requirements

  • 8+ years of experience in software engineering, with 3+ years in Go.
  • 5+ years of experience in Site Reliability Engineering practices.
  • Proven understanding of observability tools and practices.
  • Experience with application deployment and monitoring using Kubernetes.
  • Strong experience with modern DevOps practices.
  • High standards for the quality and reliability of the solutions you build.
  • Enjoyment of collaborating across team boundaries.

Benefits

  • Generous cash and equity compensation.
  • Health, dental, and vision coverage for you and your dependents.
  • Wellness and commuter stipends for select roles.
  • 401(k) plan with a 2% company match for U.S. employees.
  • Flexible paid time off plan.