2 days ago
San Francisco, CA, USA
Mid Level
Base Salary
$295k - $555k/yr
Responsibilities
- Build and refine performance models translating microbenchmark results into cost-to-serve estimates.
- Analyze inference workloads end to end across applications, models, and fleet infrastructure.
- Enhance tooling to identify latency and throughput bottlenecks across layers.
- Collaborate with other teams to turn performance insights into concrete improvements.
Requirements
- Enjoy reasoning from first principles about distributed systems, model inference, and hardware efficiency.
- Comfortable working across abstraction layers, from application behavior to kernels, accelerators, networking, and fleet scheduling.
- Deep expertise in performance profiling, benchmarking, analysis, and optimization.
- Enjoy collaborating with engineering and research teams to improve real production systems.
Categories
AI & ML, Data Engineering