5 days ago
Bellevue, WA, USA or Sunnyvale, CA, USAStaff+
Base Salary
$206k - $333k/yr
Responsibilities
- Define the multi-year benchmarking strategy and roadmap.
- Lead end-to-end MLPerf Inference and Training submissions.
- Design a Kubernetes-native benchmarking service for CoreWeave stacks.
- Build CI/CD pipelines and K8s controllers/operators for benchmarking.
- Partner with NVIDIA and key ISVs for optimizations and improvements.
Requirements
- 10+ years of experience in building distributed systems or HPC/cloud services.
- Proven track record of architecting planet-scale data systems.
- Deep understanding of GPU performance and model-server stacks.
- Proficient with Kubernetes and ML control planes.
- Excellent communication skills for interfacing with various stakeholders.
Benefits
- 100% paid medical, dental, and vision insurance.
- Company-paid life insurance and short/long-term disability insurance.
- Tuition reimbursement and participation in Employee Stock Purchase Program.
- Flexible PTO and paid parental leave.
- Catered lunch each day and a casual work environment.
Tech Stack
GrafanaKubernetesPrometheusPyTorch
Categories
AI & MLData Engineering
