Responsibilities
- Lead reliability engineering for user experience by driving operational excellence.
- Architect systems for scalability and high availability under global load.
- Identify and mitigate systemic risks and reliability bottlenecks.
- Automate operational tasks to improve deployment safety and incident response.
- Manage complex incident responses and ensure long-term fixes are implemented.
- Define best practices for reliability engineering and operational maturity.
- Mentor engineers and shape the reliability culture across the organization.
Requirements
- 8+ years of experience in Site Reliability Engineering or related roles.
- Strong collaboration and communication skills to influence technical direction.
- Experience supporting high-traffic, user-facing production environments.
- Deep understanding of distributed systems, networking, and Linux systems.
- Experience designing highly available systems with strong reliability practices.
- Strong programming skills in languages such as Go or Python.
- Understanding of observability systems, including metrics and alerting.
- Ability to troubleshoot complex issues across applications and infrastructure.
Benefits
- Global benefit programs that fit your lifestyle.
- Family planning support.
- Gender-affirming care.
- Mental health and coaching benefits.
- Group personal pension scheme with employer match.
- Private medical and dental scheme.
- Income replacement programs.
- Bike-to-work scheme.
- Flexible vacation and paid volunteer time off.
- Generous paid parental leave.
