7 days ago
Base Salary
$260k - $300k/yr
Responsibilities
- Define and own SLOs, SLIs, and error budgets for Devin and Windsurf.
- Build monitoring, alerting, and observability systems for service health.
- Lead incident response and conduct blameless postmortems.
- Own deployment pipelines and internal developer tooling.
- Manage cloud infrastructure through code to ensure scalability.
- Model growth and forecast infrastructure resource needs.
- Integrate security as a core reliability requirement.
- Collaborate with teams to build reliability into product development.
Requirements
- Deep experience running production systems at scale.
- Strong software engineering fundamentals.
- Proficiency with cloud infrastructure (AWS, GCP, or Azure).
- Experience with container orchestration (Kubernetes).
- Familiarity with infrastructure as code (Terraform or equivalent).
- Experience building and owning CI/CD pipelines.
- Strong instincts for observability and system instrumentation.
- Comfort owning incidents end to end.
Benefits
- Base salary of $260,000 - $300,000 plus significant early-stage equity.
- Fully paid medical, dental, and vision for you and your dependents.
- 401(k) with company match.
- Perks include a private chef, cozy slippers, and endless snacks.
