5 days ago
Bellevue, WA, USA or Sunnyvale, CA, USA
Staff+
Base Salary
$206k - $303k/yr
Responsibilities
- Define the long-term architecture for orchestration platforms across Kubernetes, Slurm, and related systems.
- Act as a technical authority on scheduling, quota enforcement, and multi-tenant GPU isolation.
- Lead the evolution of Kubernetes-native control planes and design systems for workload admission and validation.
- Set standards for reliability and operational readiness across orchestration services.
- Write and review production code for Kubernetes controllers and schedulers.
- Mentor senior engineers and influence technical direction across teams.
Requirements
- 15+ years of experience building and operating large-scale distributed systems.
- Deep knowledge of Kubernetes and Slurm internals.
- Experience with GPU-heavy platforms for AI training and inference.
- Strong background in Go and cloud-native systems development.
- Proven ability to set technical direction without direct authority.
- Bachelor’s or Master’s degree in a relevant field or equivalent experience.
Benefits
- 100% paid medical, dental, and vision insurance.
- Company-paid life insurance and short- and long-term disability insurance.
- Flexible Spending Account and Health Savings Account.
- Tuition reimbursement and participation in the Employee Stock Purchase Program.
- Mental wellness benefits and family-forming support.
- Flexible PTO and catered lunch each day.
