Member of Technical Staff, Evals

about 2 months ago

San Francisco, CA, USA

Mid Level / Senior

H1B Sponsor

Base Salary

$200k - $550k/yr

Responsibilities

Build and maintain the internal evals platform used across Magic.
Design, implement, and validate eval tasks for various systems.
Develop infrastructure for running large-scale evaluations.
Build systems to measure dataset quality and identify improvement opportunities.
Improve evaluation correctness, reproducibility, and reliability.
Audit and enhance public benchmarks and evaluation methodologies.
Partner with teams to define metrics that reflect model quality.
Build tooling and frameworks for trustworthy measurements.

Requirements

Strong software engineering fundamentals.
Experience building production systems or internal platforms.
Exceptional attention to detail and a high bar for correctness.
Experience with machine learning systems and evaluation frameworks.
Ability to critically reason about benchmarks and metrics.
Strong intuition for measurement quality and experimental design.
Experience operating systems that run at scale.
Strong debugging and investigative skills.
Comfortable navigating ambiguity in measurement.
Skepticism toward unvalidated results.
Track record of owning technical projects end-to-end.
Excitement about enabling better decision-making through measurements.

Benefits

Annual salary range of $200K to $550K, depending on experience.
Equity is a significant part of total compensation.
401(k) plan with 6% salary matching.
Generous health, dental, and vision insurance for you and your dependents.
Unlimited paid time off.
Visa sponsorship and relocation support for candidates moving to San Francisco.
A small, fast-moving, highly collaborative team.

Categories

AI & ML
Backend
Data Science