Base Salary
$243k - $295k/yr
Responsibilities
- Design and develop systems that promote fault tolerance and resilience.
- Promote and institute reliability best practices across the Infra Compute group.
- Build and standardize process automation for tooling and platform support.
- Create tooling that provides production guardrails through load testing.
- Develop performance monitoring services to understand capacity issues.
- Analyze systems and designs for production readiness.
Requirements
- Bachelor's degree in Computer Science or related field with at least 6 years of experience as an SRE or Software Engineer.
- Fluency in high-level programming languages such as Go, Java, or C#.
- Experience with Kubernetes or similar orchestration systems.
- Sound engineering habits for building software and tools, with a focus on reliability.