Software Engineer, Infrastructure Reliability
OpenAI
23 days ago
San Francisco, CA, USA
Mid Level / Senior
Base Salary
$255k - $385k/yr
Responsibilities
- Design, build, and operate reliable and performant systems used across engineering.
- Identify and fix performance bottlenecks and inefficiencies.
- Resolve complex issues through deep investigation.
- Continuously improve automation to reduce manual work.
- Contribute to incident response and postmortems.
- Develop best practices around system reliability and scalability.
Requirements
- 4+ years of relevant industry experience.
- 2+ years leading large-scale, complex projects or teams.
- Deep understanding of distributed systems principles.
- Experience with orchestration systems like Kubernetes.
- Proficiency in cloud infrastructure and IaC tools like Terraform.
- Strong problem-solving skills in fast-paced environments.
- Knowledge of security best practices in cloud environments.
Tech Stack
AWS, Azure, Datadog, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, Splunk, Terraform
Categories
AI & ML, Data Engineering, DevOps, Security