Responsibilities
- Set technical direction for high-performance inference solutions.
- Contribute to critical inference performance enhancements and optimizations.
- Collaborate with research and applied ML teams to influence inference strategies.
- Drive advanced performance optimizations including model parallelism and kernel optimization.
- Mentor and expand a team of performance-focused engineers.
Requirements
- Deep experience in ML performance optimization for large-scale generative models.
- Understanding of the full ML performance stack including PyTorch and TensorRT.
- Expert-level familiarity with advanced inference techniques like quantization and distributed serving.
- Hands-on leadership experience in solving complex performance challenges.
- Ability to thrive in cross-functional collaboration with ML teams and stakeholders.
Benefits
- High-impact role in a rapidly growing company with a world-changing vision.
- Opportunity to set new standards for inference performance.
Tech Stack
PyTorch
Categories
AI & ML, Data Engineering
