
Staff Technical Lead for Inference & ML Performance

fal

Responsibilities

  • Set technical direction for high-performance inference solutions.
  • Contribute to critical inference performance enhancements and optimizations.
  • Collaborate with research and applied ML teams to influence inference strategies.
  • Drive advanced performance optimizations including model parallelism and kernel optimization.
  • Mentor and expand a team of performance-focused engineers.

Requirements

  • Deep experience in ML performance optimization for large-scale generative models.
  • Understanding of the full ML performance stack including PyTorch and TensorRT.
  • Expert-level familiarity with advanced inference techniques like quantization and distributed serving.
  • Hands-on leadership experience in solving complex performance challenges.
  • Ability to thrive in cross-functional collaboration with ML teams and stakeholders.
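As a flavor of the inference techniques the role touches on, the sketch below applies post-training dynamic quantization to a toy PyTorch model (the model and sizes are hypothetical, chosen only for illustration): `Linear` weights are stored as int8 and activations are quantized on the fly at inference time, a common first step for shrinking serving footprints.

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for a serving workload.
model = nn.Sequential(
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 128),
)
model.eval()

# Post-training dynamic quantization: int8 weights,
# activations quantized per-batch at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    y = quantized(x)
print(y.shape)
```

Production systems layer further techniques on top of this (static quantization with calibration, tensor/pipeline parallelism, custom kernels), but the API shape is representative of the PyTorch side of the stack.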

Benefits

  • High-impact role in a rapidly growing company with a world-changing vision.
  • Opportunity to set new standards for inference performance.

Tech Stack

PyTorch

Categories

AI & ML, Data Engineering