2 months ago
Boston, MA, USA +14 more
Senior
Base Salary
$225k - $315k/yr
Responsibilities
- Architect and optimize distributed training and inference systems for large-scale AI models.
- Design and deliver customer-focused solutions that maximize performance and business value.
- Lead the transition of ML pipelines from POC to scalable production systems.
- Build long-term customer relationships, ensuring satisfaction and alignment with strategic goals.
- Create whitepapers, deliver technical presentations, and host webinars to share insights and best practices.
- Provide technical leadership and mentor teams on AI infrastructure and deployment strategies.
- Collaborate with engineering and product teams to prioritize customer feedback and influence product roadmaps.
Requirements
- 5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles.
- Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments.
- Demonstrated success delivering ML products, scaling from POC to production.
- Deep knowledge of ML frameworks such as PyTorch and JAX.
- Strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand).
- Exceptional communication skills to engage both technical teams and business stakeholders.
- Legal authorization to work in the United States on a full-time basis without sponsorship.
Benefits
- Competitive compensation: $225,000 to $315,000 per year (negotiable based on experience and location).
- Full medical benefits: 100% company-paid medical, dental, and vision coverage for employees and families.
- 401(k) plan with a 4% match program.
- Stock options plan.
- Flexible remote work environment.
- Company-paid short-term disability, long-term disability, and life insurance coverage.
- 20 weeks of paid parental leave for primary caregivers and 12 weeks for secondary caregivers.
- Up to $85/month for mobile and internet.
- Work with state-of-the-art AI and cloud technologies, including the latest NVIDIA GPUs.
- Contribute to sustainable AI infrastructure with energy-efficient data centers.
