Agent Post-Training, Computer Use Research

about 3 hours ago

H1B Sponsor

Base Salary

$295k - $445k/yr

Responsibilities

Design and run experiments to improve agent behavior for complex computer use.
Own end-to-end improvements to the post-training stack, including RL and data pipelines.
Build evaluations and environments to identify model failures and create training data.
Collaborate with product teams to translate user needs into model improvements.
Work on early-training interventions that shape agent behavior.
Decide on integrations and capabilities for major model runs.
Enhance large-scale training machinery for reliability and efficiency.
Debug failures in models and develop concrete hypotheses for improvements.

Qualifications

Strong technical fundamentals in machine learning, software engineering, or related fields.
Hands-on experience with LLMs, RL, and model training systems.
Ability to tackle open-ended problems with research and engineering skills.
Focus on product impact and model behavior, not just benchmarks.
Ability to turn vague problems into concrete experiments.
Comfort working across teams and communicating effectively.
Willingness to build systems and processes as needed by the team.