4 days ago
Bellevue, WA, USA or Sunnyvale, CA, USA
Staff+
Base Salary
$188k - $275k/yr
Responsibilities
- Lead architecture and performance initiatives across multiple services.
- Optimize inference performance focusing on latency, throughput, and GPU utilization.
- Improve system reliability for real-time inference systems.
- Drive cross-team design initiatives and influence engineering direction.
- Work on scheduling, batching, and memory optimization in distributed systems.
Requirements
- 8–12+ years of experience in large-scale distributed systems or cloud platforms.
- Proven experience leading cross-team technical initiatives.
- Strong programming skills in Go, Python, or C++.
- Deep expertise in Kubernetes at production scale.
- Strong understanding of distributed systems and performance optimization.
- Experience with low-latency, high-throughput systems.
- Hands-on experience with inference systems and optimization strategies.
- Familiarity with mixed precision and streaming inference workloads.
Benefits
- 100% paid medical, dental, and vision insurance.
- Company-paid life insurance and short/long-term disability insurance.
- Flexible Spending Account and Health Savings Account.
- Tuition reimbursement and employee stock purchase program.
- Mental wellness benefits and family-forming support.
- Paid parental leave and flexible childcare support.
- 401(k) with generous employer match and flexible PTO.
- Catered lunch in office locations and a casual work environment.
