about 2 hours ago
Remote, United States
Senior / Staff+
H-1B Sponsor
Base Salary
$159k - $250k/yr
Responsibilities
- Build and tune production LLM serving to maximize throughput and minimize latency.
- Instrument and profile training runs to identify and resolve bottlenecks.
- Apply knowledge of GPU architecture to optimize model performance.
- Deploy and operate multiple models within shared GPU clusters on GKE.
- Measure and improve GPU utilization to enhance throughput-per-dollar.
- Collaborate with clients to understand their performance and cost requirements and implement solutions that meet them.
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in ML/AI engineering with a focus on performance and infrastructure.
- Proven experience in deploying and optimizing models in production environments.
- Demonstrated ability to profile and improve GPU utilization.
- Experience with classical machine learning is a strong plus.
- Knowledge of Data Engineering and SQL.
Benefits
- Comprehensive Health Insurance.
- Paid Leave (Vacation/PTO).
- Paid Holidays.
- Sick Leave.
- Parental Leave.
- Bereavement Leave.
- 401(k) Employer Match.
- Employee Referral Bonuses.
Categories
AI & ML, Data Engineering
