4 months ago
Base Salary
$200k - $250k/yr
Responsibilities
- Define and evolve reliability standards, SLIs, SLOs, and error budgets.
- Improve observability, alerting, and incident processes across services.
- Lead high-severity incidents and drive clear, actionable follow-ups.
- Partner with engineering teams to design resilient, scalable systems.
- Build automation to reduce toil and lower operational risk.
- Mentor engineers and influence best practices across teams.
Requirements
- 10+ years in SRE, infrastructure, or backend engineering roles.
- Strong software engineering experience in one or more modern languages.
- Expertise operating distributed systems in production at scale.
- Deep experience with AWS, observability tooling, and CI/CD systems.
- Comfortable navigating ambiguity and setting direction in a fast-moving environment.
Benefits
- Competitive compensation and equity.
- Unlimited PTO.
- Up to 100% employer-covered monthly healthcare premiums (medical, dental, vision).
- Lunch provided via Sharebite, plus dinner on late office days.
- Up to 12 weeks of parental leave.
- Tax-free commuter and parking benefits.
- Voluntary insurance (life, hospital, critical illness, accident).
- Employee Assistance Program (Rightway).
- Free One Medical Membership.
- 401(k).
