Responsibilities
- Lead reliability initiatives across multiple Ads domains including ad serving and reporting.
- Partner with engineering leadership to improve reliability and operational excellence.
- Drive architecture reviews and influence technical decisions for revenue-generating systems.
- Design and build platforms and automation to enhance reliability and developer productivity.
- Participate in on-call rotations and lead incident investigations during major production events.
- Identify systemic reliability risks and implement long-term solutions.
- Establish reliability metrics for advertiser-critical user journeys.
- Mentor engineers and provide technical leadership across teams.
- Influence roadmap planning to incorporate reliability considerations.
Requirements
- 8+ years of experience in Site Reliability Engineering or related roles.
- Strong experience supporting high-traffic, user-facing production environments.
- Deep understanding of distributed systems, networking, and cloud-native architectures.
- Experience designing highly available systems with strong operational practices.
- Strong understanding of observability systems, including metrics and alerting.
- Good programming skills in languages such as Go or Python.
- Experience improving reliability through SLOs and incident management.
- Demonstrated ability to troubleshoot complex issues in distributed systems.
- Strong collaboration and communication skills.
Benefits
- Global Benefit programs that fit your lifestyle, including workspace and professional development support.
- Family Planning Support.
- Gender-Affirming Care.
- Mental Health & Coaching Benefits.
- Group Personal Pension Scheme with Employer match.
- Private Medical and Dental Scheme.
- Income Replacement Programs.
- Bike to Work scheme.
- Flexible Vacation & Paid Volunteer Time Off.
- Generous Paid Parental Leave.
