Base Salary
$250k - $380k/yr
Responsibilities
- Design and run experiments to improve agent behavior for complex computer use.
- Own end-to-end improvements to the post-training stack, including RL and data pipelines.
- Build evaluations and environments to identify model failures and create training data.
- Collaborate with product teams to translate user needs into model improvements.
- Work on early-training and alignment interventions to shape agent behavior.
- Decide on integrations and capabilities for major model runs.
- Improve the infrastructure for large-scale training and launch.
- Debug failures in shipped models and develop concrete hypotheses for fixes.
Requirements
- Strong technical fundamentals in machine learning, software engineering, or related fields.
- Hands-on experience with LLMs, RL, and production ML systems.
- Ability to tackle open-ended problems using both research and engineering skills.
- Focus on product impact and model behavior beyond just benchmarks.
- Ability to move from vague problems to concrete experiments.
- Comfort working across teams and communicating effectively.
- Willingness to build systems and processes as needed.