Researcher, Computer Use - Agent Post-Training

about 3 hours ago

H-1B Sponsor

Base Salary

$250k - $380k/yr

Responsibilities

Design and run experiments to improve agent behavior for complex computer use.
Own end-to-end improvements to the post-training stack, including RL and data pipelines.
Build evaluations and environments to identify model failures and create training data.
Collaborate with product teams to translate user needs into model improvements.
Work on early-training and alignment interventions to shape agent behavior.
Decide on integrations and capabilities for major model runs.
Improve the infrastructure for large-scale training runs and model launches.
Debug failures in shipped models and develop concrete hypotheses for fixes.

Qualifications

Strong technical fundamentals in machine learning, software engineering, or related fields.
Hands-on experience with LLMs, RL, and production ML systems.
Ability to tackle open-ended problems using both research and engineering skills.
Focus on product impact and model behavior, not just benchmarks.
Ability to turn vague problems into concrete experiments.
Comfort working across teams and communicating effectively.
Willingness to build systems and processes as needed.