XPENG

Staff Machine Learning Engineer – Autonomous Driving Model Quantization & Deployment

27 days ago
Santa Clara, CA, USA
Staff+

Base Salary

$215k - $364k/yr

Responsibilities

  • Own the end-to-end quantization and optimization roadmap for large-scale multimodal models.
  • Apply and innovate in Post-Training Quantization, Quantization-Aware Training, and pruning techniques.
  • Collaborate with model researchers to ensure architectures are deployment-friendly.
  • Develop and maintain robust, safety-critical deployment stacks in Modern C++.

Requirements

  • 5-8 years of experience in model deployment, quantization, or high-performance computing.
  • Mastery of Modern C++ and deep experience with CUDA or other hardware acceleration libraries.
  • Strong familiarity with PyTorch and knowledge of inference engines like TensorRT, ONNX Runtime, or TVM.
  • Hands-on experience with INT8/FP8/INT4 quantization and knowledge of challenges in quantizing Large Language Models.
  • Solid understanding of computer architecture and experience with embedded/edge compute constraints.
  • Ability to debug complex performance bottlenecks across the entire software stack.

Benefits

  • A fun, supportive and engaging environment.
  • Infrastructure and computational resources to support your ML model development and research.
  • Opportunity to work on cutting-edge technologies with the top talent in the field.
  • Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
  • Competitive compensation package.
  • Snacks, lunches, dinners, and fun activities.

Tech Stack

PyTorch

Categories

AI & ML, Data Science, Embedded