Staff Machine Learning Engineer – Autonomous Driving Model Quantization & Deployment

27 days ago

Santa Clara, CA, USA

Staff+

Base Salary

$215k - $364k/yr

Responsibilities

Own the end-to-end quantization and optimization roadmap for large-scale multimodal models.
Apply and innovate in Post-Training Quantization, Quantization-Aware Training, and pruning techniques.
Collaborate with model researchers to ensure architectures are deployment-friendly.
Develop and maintain robust, safety-critical deployment stacks in Modern C++.

Qualifications

5-8 years of experience in model deployment, quantization, or high-performance computing.
Mastery of Modern C++ and deep experience with CUDA or other hardware acceleration libraries.
Strong familiarity with PyTorch and knowledge of inference engines like TensorRT, ONNX Runtime, or TVM.
Hands-on experience with INT8/FP8/INT4 quantization and knowledge of challenges in quantizing Large Language Models.
Solid understanding of computer architecture and experience with embedded/edge compute constraints.
Ability to debug complex performance bottlenecks across the entire software stack.

Benefits

A fun, supportive, and engaging environment.
Infrastructure and computational resources to support your ML model development and research.
Opportunity to work on cutting-edge technologies with top talent in the field.
Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
Competitive compensation package.
Snacks, lunches, dinners, and fun activities.
