Lead Engineer, Inference Platform
MongoDB
5 months ago
Palo Alto, CA, USA
Senior / Staff+
H1B Sponsor
Base Salary
$137k - $270k/yr
Responsibilities
- Partner with AI engineers to productionize embedding models and rerankers for batch and real-time inference.
- Lead projects focused on performance optimization, GPU utilization, autoscaling, and observability.
- Design and build components of a multi-tenant inference service integrated with Atlas Vector Search.
- Contribute to platform features like model versioning, safe deployment pipelines, and model health monitoring.
- Collaborate with cross-functional teams to define architectural patterns supporting high availability and low latency.
- Guide decisions on model serving architecture using tools like vLLM and Kubernetes.
- Provide technical leadership and mentorship to junior engineers.
Requirements
- 8+ years of engineering experience in backend systems, ML infrastructure, or scalable platform development.
- Expertise in serving embedding models in production environments.
- Strong systems skills in languages like Go, Rust, C++, or Python.
- Experience with cloud-native distributed systems focusing on latency and availability.
- Familiarity with inference runtimes and vector search systems.
- Proven ability to collaborate across disciplines and experience levels.
- Experience with high-scale SaaS infrastructure in multi-tenant environments.
- 1+ years of experience as a technical lead for a large-scale ML inference or training platform.
Benefits
- Competitive compensation and equity.
- Career growth in a hands-on technical leadership role.
- Flexible paid time off and 20 weeks of fully paid, gender-neutral parental leave.
- Fertility and adoption assistance.
- 401(k) plan and mental health counseling.
- Access to transgender-inclusive health insurance coverage.
Tech Stack
AWS, Azure, C++, Go, Google Cloud, Kubernetes, MongoDB, Python, Rust
Categories
AI & ML, Backend