Staff Software Engineer, Inference

about 2 months ago

Bellevue, WA, USA or Sunnyvale, CA, USA

Staff+

H-1B Sponsor

Base Salary

$188k - $275k/yr

Responsibilities

Lead architecture and performance initiatives across multiple services.
Optimize inference performance, focusing on latency, throughput, and GPU utilization.
Improve system reliability for real-time inference systems.
Drive cross-team design initiatives and influence engineering direction.
Work on scheduling, batching, and memory optimization in distributed systems.

Qualifications

8–12+ years of experience in large-scale distributed systems or cloud platforms.
Proven experience leading cross-team technical initiatives.
Strong programming skills in Go, Python, or C++.
Deep expertise in Kubernetes at production scale.
Strong understanding of distributed systems and performance optimization.
Experience with low-latency, high-throughput systems.
Hands-on experience with inference systems and optimization strategies.
Familiarity with mixed precision and streaming inference workloads.