Base Salary
$200k - $550k/yr
Responsibilities
- Build and maintain the internal evals platform used across Magic.
- Design, implement, and validate eval tasks for various systems.
- Develop infrastructure for running large-scale evaluations.
- Build systems to measure dataset quality and identify improvement opportunities.
- Improve evaluation correctness, reproducibility, and reliability.
- Audit and enhance public benchmarks and evaluation methodologies.
- Partner with teams to define metrics that reflect model quality.
- Build tooling and frameworks for trustworthy measurements.
Requirements
- Strong software engineering fundamentals.
- Experience building production systems or internal platforms.
- Exceptional attention to detail and a high bar for correctness.
- Experience with machine learning systems and evaluation frameworks.
- Ability to critically reason about benchmarks and metrics.
- Strong intuition for measurement quality and experimental design.
- Experience operating large-scale systems.
- Strong debugging and investigative skills.
- Comfortable navigating ambiguity in measurement.
- Skepticism toward unvalidated results.
- Track record of owning technical projects end-to-end.
- Excitement about enabling better decision-making through measurements.
Benefits
- Annual salary range of $200K to $550K, depending on experience.
- Equity is a significant part of total compensation.
- 401(k) plan with 6% salary matching.
- Generous health, dental, and vision insurance for you and your dependents.
- Unlimited paid time off.
- Visa sponsorship and relocation support for candidates moving to San Francisco.
- A small, fast-moving, highly collaborative team.
