Agent Post-Training, API & Power Users

H1B Sponsor

Base Salary

$295k - $445k/yr

Responsibilities

Design and run experiments to improve model behavior in API and power-user workflows.
Build evals, graders, and environments based on real developer workflows.
Partner with users to identify behavior gaps and convert product signals into interventions.
Improve model behavior in multi-step tasks and error recovery.
Own end-to-end model behavior projects from analysis to launch readiness.
Develop feedback loops using user traces and API patterns to discover model gaps.
Decide which agentic capabilities and behavioral fixes to prioritize for major model runs.
Debug failures in shipped models by analyzing traces and evals.
Work on early-training and alignment interventions to shape agent behavior.
Improve the tooling and infrastructure used for large-scale training and launch.

Qualifications

Strong technical fundamentals in ML, software engineering, systems, or applied research.
Hands-on experience with LLMs, post-training, and production ML systems.
Ability to form hypotheses about model behavior from transcripts and eval failures.
Excitement about solving ambiguous capability problems with noisy signals.
Deep care for developer and expert-user experience in real workflows.
Comfort working across research, product, infrastructure, and safety boundaries.
Willingness to build systems and processes as needed by the team.
Desire to train and ship models that are genuinely useful to a wide range of users.