AI Research Engineer, Enterprise Evaluations

5 months ago

New York, NY, USA or San Francisco, CA, USA

Mid Level / Senior

H-1B Sponsor

Base Salary

$179k - $224k/yr

Responsibilities

Collaborate with Scale’s Operations team and enterprise customers to create structured evaluation data.
Analyze feedback and data to refine evaluation frameworks and improve human-curated assessments.
Design and develop LLM-as-a-Judge autorater frameworks and AI-assisted evaluation systems.
Pursue research initiatives to explore new methodologies for evaluating enterprise agents.

Requirements

Bachelor’s degree in Computer Science, Electrical Engineering, or a related field.
2+ years of experience in Machine Learning or Applied Research.
Hands-on experience with Large Language Models and Generative AI.
Strong understanding of model evaluation methodologies and current research.
Proficiency in Python and major ML frameworks such as PyTorch or TensorFlow.
Solid foundation in engineering and statistical analysis.

Benefits

Comprehensive health, dental, and vision coverage.
Retirement benefits and a learning and development stipend.
Generous PTO and a potential commuter stipend.

Tech Stack

Python, PyTorch, TensorFlow

Categories

AI & ML, Data Science