Anthropic

Software Engineer, Safeguards Evals

about 17 hours ago
San Francisco, CA, USA or New York, NY, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$320k - $485k/yr

Responsibilities

  • Build and own the evaluation harness for an agentic investigation system.
  • Construct high-quality eval datasets representing real-world misuse.
  • Measure agent performance end-to-end and drive improvements.
  • Analyze coverage to identify measurement gaps and evolve evaluations.
  • Productionize successful research into regression and release pipelines.
  • Build tooling for policy experts to run evaluations independently.
  • Construct RL environments to enhance safety investigation capabilities.

Requirements

  • Proficiency in Python and comfort working across the stack.
  • Experience building and maintaining data pipelines.
  • Experience with LLMs and an understanding of their capabilities and failure modes.
  • Strong data analysis skills to draw insights from large datasets.
  • Ability to transition between research prototyping and production-quality code.
  • Ability to translate ambiguous problems into concrete, testable experiments.

Benefits

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • Collaborative office space.

Categories

AI & ML, Data Science, Testing