Base Salary
$182k - $242k/yr
Responsibilities
- Design, build, and operate Go-based services for GPU data center infrastructure.
- Automate data center bring-up, hardware discovery, and health monitoring.
- Develop APIs and services for managing server health and firmware state.
- Enhance observability and operational tooling for quick issue resolution.
- Turn lessons from hardware failures into software improvements that increase resilience.
- Collaborate with infrastructure and operations teams to keep fleet-scale operations safe.
Requirements
- 5+ years of experience building and operating infrastructure or backend systems.
- Bachelor’s or Master’s degree in Computer Science or related field, or equivalent experience.
- Strong proficiency in Go for production services and tools.
- Experience designing and building gRPC and REST APIs.
- Familiarity with Kubernetes and containerized workloads in production.
- Knowledge of observability tools like Prometheus and Grafana.
Benefits
- 100% paid medical, dental, and vision insurance.
- Company-paid life insurance and short/long-term disability insurance.
- Flexible Spending Account and Health Savings Account.
- Tuition reimbursement and employee stock purchase program.
- Mental wellness benefits and family-forming support.
- Flexible PTO and catered lunch at office locations.
