Responsibilities
- Set technical direction for high-performance inference solutions.
- Contribute to critical inference performance enhancements and optimizations.
- Collaborate with research and applied ML teams to influence inference strategies.
- Drive advanced performance optimizations including model parallelism and kernel optimization.
- Mentor and expand a team of performance-focused engineers.
Requirements
- Deep experience in ML performance optimization for large-scale generative models.
- Understanding of the full ML performance stack including PyTorch and TensorRT.
- Expert-level familiarity with advanced inference techniques like quantization and distributed serving.
- Hands-on leadership experience in solving complex performance challenges.
- Ability to thrive in cross-functional collaboration with ML teams and stakeholders.
Benefits
- High-impact role in a rapidly growing company with a world-changing vision.
- Opportunity to set new standards for inference performance.
Tech Stack
PyTorch
Categories
AI & ML, Data Engineering
