Staff Machine Learning Engineer, AI Serving

about 4 hours ago

Remote, United States
Staff+

H-1B Sponsor

Base Salary

$253k - $355k/yr

Responsibilities

Lead the design, implementation, and maintenance of a GPU-based model serving system.
Develop ML and Generative AI systems in cloud environments on Kubernetes.
Rapidly prototype and create a high-performance feature hydration and processing system.
Establish a unified GPU model export framework for optimized inference models.
Implement real-time ML observability to track feature and model performance.
Serve LLMs online at scale.
Build an end-to-end inference performance benchmarking framework.
Understand multi-cluster compute environments and network topology for ML inference.

Qualifications

7+ years of experience in ML Engineering, AI Platform Engineering, or Cloud AI Deployment roles.
Experience operating orchestration systems like Kubernetes at scale.
Deep knowledge of cloud technologies for ML platforms, including AWS and Google Cloud.
Proficiency in programming languages such as Go and Python.
Excellent communication skills for articulating technical concepts to non-technical stakeholders.
Strong focus on scalability, reliability, performance, and user experience.
Knowledge of model serving, inference pipelines, and observability for AI systems is a plus.
Strong proficiency in Python and experience with AI/ML frameworks like Triton and PyTorch.

AWS, Go, Google Cloud, Kubernetes, Python, PyTorch, Terraform