25 days ago
San Francisco, CA, USA
Entry Level / Mid Level
Responsibilities
- Build diverse, high-fidelity environments for agent testing.
- Design complex tasks that require long-horizon reasoning and tool use.
- Develop robust verifiers to measure agent performance reliably.
- Improve infrastructure and tooling for running and debugging environments.
- Collaborate with the research team to identify failure modes and create new tasks.
Requirements
- Strong engineering fundamentals.
- Ability to build from first principles and solve open-ended technical problems.
- High agency with a strong bias toward shipping.
- Commitment to quality and to building robust systems.
