Senior Software Engineer, ML Infrastructure

about 1 month ago

San Francisco, CA, USA

Senior

H-1B Sponsor

Base Salary

$200k - $400k/yr

Responsibilities

Design and build distributed training platforms for LLM and multimodal fine-tuning.
Integrate state-of-the-art training algorithms into production pipelines.
Own inference architecture and multi-provider routing, including failover and optimization.
Lead initiatives to improve latency and cost efficiency across the training and serving stack.
Build evaluation and experimentation infrastructure for rapid iteration.
Drive technical direction, mentor engineers, and establish best practices for ML infrastructure.

Requirements

6+ years building ML infrastructure or production systems at scale.
Deep experience with distributed training, including multi-node GPU clusters.
Strong understanding of LLM inference, latency optimization, and serving architecture.
Proven track record leading complex, multi-quarter technical projects.

Benefits

Take-what-you-need vacation policy.
Medical, Dental, and Vision benefits for you and your family.
Life Insurance and Disability Benefits.
Retirement Plan (e.g., 401(k), pension).
Parental Leave.
Fertility and family building benefits through Carrot.
Daily lunches and snacks in the office.

Categories

AI & ML, Backend