Anthropic

Research Engineer, Model Evaluations

about 3 hours ago
New York, NY, USA +2 more
Mid Level
H-1B Sponsor

Base Salary

$320k - $485k/yr

Responsibilities

  • Design and run evaluations of Claude's capabilities, producing visualizations for stakeholders.
  • Build and maintain a distributed platform that executes evaluations reliably.
  • Manage dashboards for monitoring model health during training.
  • Debug evaluation results during training runs and communicate findings under pressure.
  • Enhance tools and workflows for researchers to implement evaluations.
  • Collaborate with research teams to define metrics and interpret results.
  • Conduct experiments to analyze the impact of various factors on evaluation results.
  • Communicate evaluation results to internal and external audiences.

Requirements

  • Strong Python programming skills, including experience with production or research infrastructure.
  • Experience with distributed systems, data pipelines, or reliable infrastructure at scale.
  • Excellent written and verbal communication skills, especially for non-specialist audiences.
  • Ability to operate in an on-call or production-support capacity during live training.
  • Concern for the societal impacts of AI and a desire to ensure that it is safe and beneficial.

Benefits

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • Collaborative office space.

Tech Stack

Python

Categories

AI & ML
Data Science