Machine Learning Systems Research Engineer, Agent Post-training - Enterprise GenAI

8 months ago

San Francisco, CA, USA or New York, NY, USA

Mid Level / Senior

H1B Sponsor

Base Salary

$181k - $315k/yr

Responsibilities

Build, profile, and optimize the training and inference framework.
Post-train state-of-the-art models to define stable post-training recipes.
Collaborate with ML teams to accelerate research and development.
Create a next-gen agent training algorithm for multi-agent/multi-tool rollouts.

Requirements

1-3 years of LLM training experience in a production environment.
Passion for system optimization.
Experience with post-training methods like RLHF/RLVR and algorithms like PPO/GRPO.
Ability to work with the architecture of modern GPU clusters.
Experience with multi-node LLM training and inference.
Strong software engineering skills, with proficiency in CUDA, PyTorch, and transformers.
Strong written and verbal communication skills.
PhD or Master's degree in Computer Science or a related field.

Benefits

Comprehensive health, dental, and vision coverage.
Retirement benefits.
Learning and development stipend.
Generous PTO.
Potential commuter stipend.

Categories

AI & ML, Data Science