about 4 hours ago
Base Salary
$345k - $399k/yr
Responsibilities
- Serve as the GPU technical leader for the Compute team.
- Own the GPU host lifecycle above raw fleet management.
- Architect how GPU capacity is exposed to compute platforms.
- Drive GPU reliability and performance at fleet scale.
- Evaluate and onboard new GPU and AI accelerator platforms.
- Establish standards, tooling, and APIs for GPU compute consumption.
Requirements
- 10+ years of experience building and operating large-scale distributed systems.
- Deep, hands-on GPU expertise at the machine management layer and above.
- A track record as an expert for compute infrastructure.
- Strong proficiency in Go or other well-structured programming languages.
- Experience operating GPU and AI workloads in production.
- Familiarity with Kubernetes for GPU workloads is a strong plus.
- A history of being the anchor expert for GPU and compute problems.