25 days ago
New York, NY, USA
Mid Level / Senior
Base Salary
$175k - $275k/yr
Responsibilities
- Build pipelines for large-scale LLM training and fine-tuning.
- Optimize training and inference performance.
- Implement distributed systems for scalable experimentation.
- Improve latency, throughput, and cost efficiency for multimodal inputs.
- Support deployment of LLM systems into production environments.
- Develop tools for evaluation, monitoring, and iteration.
Requirements
- 2+ years of professional industry experience.
- Strong experience with large-scale ML systems.
- Proficiency in PyTorch, CUDA, Triton, and distributed training.
- Experience optimizing LLM training or inference.
- Strong systems intuition regarding memory, compute, and throughput tradeoffs.
- Ability to bridge research and production.
- Strong software engineering and debugging skills.
Benefits
- Comprehensive medical, dental, and vision plans.
- 401(k) with employer match.
- Commuter benefits.
- Catered lunch multiple days per week.
- Dinner stipend for late work hours.
- Grubhub subscription.
- Health and wellness perks.
- Multiple team offsites per year, plus monthly team events.
- Generous PTO policy.
Tech Stack
PyTorch
Categories
AI & ML
Data Engineering
