Lead Machine Learning Engineer, Inference & Performance

about 2 hours ago

Remote, United States

Senior / Staff+

H1B Sponsor

Base Salary

$159k - $250k/yr

Responsibilities

Build and tune production LLM serving to maximize throughput and minimize latency.
Instrument and profile training runs to identify and resolve bottlenecks.
Apply knowledge of GPU architecture to optimize model performance.
Deploy and operate multiple models within shared GPU clusters on GKE.
Measure and improve GPU utilization to enhance throughput-per-dollar.
Collaborate with clients to understand their performance and cost requirements and deliver against them.

Requirements

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
5+ years of experience in ML/AI engineering with a focus on performance and infrastructure.
Proven experience deploying and optimizing models in production environments.
Demonstrated ability to profile and improve GPU utilization.
Experience with classical machine learning is a strong plus.
Knowledge of data engineering and SQL.

Benefits

Comprehensive Health Insurance.
Paid Leave (Vacation/PTO).
Paid Holidays.
Sick Leave.
Parental Leave.
Bereavement Leave.
401(k) Employer Match.
Employee Referral Bonuses.

Tech Stack

Google Cloud, Kubernetes, Python, SQL

Categories

AI & ML, Data Engineering