SWE, Applied Evals
OpenAI
6 months ago
San Francisco, CA, USA
Mid Level / Senior
Base Salary
$230k - $325k/yr
Responsibilities
- Define core evaluation signals for model improvement.
- Design reliable and extensible agents, harnesses, and eval pipelines.
- Prototype solutions with real workflows and create scalable feedback loops.
- Connect evaluation signals to research and training systems.
- Collaborate with engineering, research, and product teams on model deployment and measurement.
- Build reusable systems and tools that enable contributions from across the company.
Requirements
- 4+ years of experience in software engineering with a track record of shipping production systems.
- Experience building AI agents or applications, including designing evals.
- Familiarity with evaluation methods for LLMs and multi-agent workflows.
- Knowledge of deep learning concepts or prior exposure to training models.
- Strong communication skills for technical and non-technical audiences.
- Motivated by high-impact collaboration and able to thrive in ambiguity.
Benefits
- Hybrid work model with 3 days in the office per week.
- Relocation assistance offered.
Categories
AI & ML, Backend, Full Stack