Research Engineer, Model Evaluations

about 2 months ago

Remote, Worldwide +2 more

Mid Level

H1B Sponsor

Base Salary

$320k - $485k/yr

Responsibilities

Design and run evaluations of Claude's capabilities, producing visualizations for stakeholders.
Build and maintain a reliable distributed platform for executing evaluations.
Manage dashboards for monitoring model health during training.
Debug evaluation results during training runs and communicate findings under pressure.
Enhance tools and workflows for researchers to implement evaluations.
Collaborate with research teams to define metrics and interpret results.
Conduct experiments to analyze the impact of various factors on evaluation results.
Communicate evaluation results to internal and external audiences.

Requirements

Strong Python programming skills, including experience with production or research infrastructure.
Experience with distributed systems, data pipelines, or reliable infrastructure at scale.
Excellent written and verbal communication skills, especially for non-specialist audiences.
Ability to operate in an on-call or production-support capacity during live training.
Concern for the societal impacts of AI and a desire to ensure it is safe and beneficial.

Benefits

Competitive compensation and benefits.
Optional equity donation matching.
Generous vacation and parental leave.
Flexible working hours.
Collaborative office space.

Categories

AI & ML, Data Science