Bellevue, WA, USA or Menlo Park, CA, USA
Staff+
H-1B Sponsor
Base Salary
$199k - $300k/yr
Responsibilities
- Define and implement evaluation frameworks for agent performance.
- Evaluate models based on quality, latency, cost, and edge cases.
- Collaborate with product managers, data scientists, and engineers to set launch criteria.
- Analyze production issues and prioritize improvements for system reliability.
- Build metrics and reporting to enhance visibility into agent performance.
Requirements
- Deep experience in defining and measuring quality for machine learning systems.
- Experience evaluating large language models and understanding performance tradeoffs.
- Ability to analyze production issues and lead quality improvement initiatives.
- Comfortable collaborating with engineers, data scientists, and product partners.
- Experience in regulated environments or with AI evaluation tools is a plus.
Benefits
- Challenging, high-impact work to advance your career.
- Performance-driven compensation with bonuses and equity ownership.
- Health insurance 100% paid for employees and 90% paid for dependents.
- Flexible benefits spending account for wellness and learning.
- Employer-paid life and disability insurance, plus fertility and mental health benefits.
- Generous time off policies including holidays, paid time off, and parental leave.
- Exceptional office experience with catered meals and comfortable workspaces.
Categories
AI & ML, Data Science