Agent Post-Training, Context Research

H1B Sponsor

Base Salary

$295k - $445k/yr

Responsibilities

Design and run experiments to improve scaling of compute on context.
Own end-to-end improvements to the post-training stack, including RL and data pipelines.
Build evals and environments to identify model failures and turn them into training data.
Collaborate with product teams to translate user needs into model improvements.
Work on early-training and alignment interventions to shape agent behavior.
Decide on integrations and capabilities for major model runs.
Enhance machinery for large-scale training and launch.
Debug failures in shipped models and develop concrete hypotheses and fixes.

Qualifications

Strong technical fundamentals in machine learning, software engineering, or related fields.
Hands-on experience with LLMs, RL, and model training systems.
Ability to tackle open-ended problems with research taste and engineering execution.
Focus on product impact and model behavior beyond just benchmarks.
Skill in moving from vague problems to concrete experiments and analyses.
Comfort working across various domains and communicating effectively.
Willingness to build systems and processes as needed by the team.