Site Reliability Engineer
Zoox
about 1 month ago
Foster City, CA, USA
Senior
H1B Sponsor
Responsibilities
- Design and implement highly scalable and reliable systems for the autonomous vehicle platform.
- Optimize system performance, reliability, and scalability.
- Develop and maintain monitoring, alerting, and reporting systems.
- Collaborate with software engineering teams to improve software architecture and deployment processes.
- Conduct root cause analysis of production issues and implement corrective actions.
- Implement disaster recovery and business continuity plans.
Requirements
- 5+ years of experience in site reliability engineering or a similar role.
- Strong background in working with large-scale distributed systems.
- Proven experience with cloud platforms such as AWS, GCP, or Azure.
- Expertise in container orchestration technologies like Kubernetes.
- Deep understanding of networking, storage, and database technologies.
- Strong programming skills in languages such as Python, Go, C/C++, or Java.
- Experience with infrastructure as code tools such as Terraform, Ansible, Salt, or CloudFormation.
Tech Stack
Ansible, AWS, Azure, C++, Go, Google Cloud Platform, Java, Kubernetes, Python, Terraform
Categories
AI & ML, Data Engineering, DevOps, Security