
Principal ML Ops Engineer
Pragmatike
17 days ago
Cambridge, MA, USA
Staff+ / Senior
Responsibilities
- Architect, build, and scale the end-to-end ML Ops pipeline.
- Design reliable infrastructure for model deployment and orchestration.
- Optimize compute usage across distributed systems.
- Lead the implementation of observability for ML systems.
- Build automated workflows for dataset curation and CI/CD for ML models.
- Collaborate with researchers to productionize models.
- Establish ML Ops best practices and internal standards.
- Mentor engineers and influence architectural direction.
Requirements
- Deep hands-on experience designing and operating production ML systems at scale.
- Strong background in ML Ops, distributed systems, and cloud infrastructure.
- Proficiency with Python and familiarity with TypeScript or Go.
- Expertise in ML frameworks such as PyTorch and in GPU computing with CUDA.
- Strong experience with containerization and orchestration tools.
- Deep understanding of ML lifecycle workflows.
- Ability to lead technical strategy and collaborate cross-functionally.
Benefits
- Competitive salary and equity options.
- Sign-on bonus.
- Health, dental, and vision insurance.
- 401(k) plan.