Base Salary
$275k - $325k/yr
Responsibilities
- Design and build Hazel's evals platform end-to-end.
- Build production observability and monitoring for AI quality.
- Architect data curation pipelines for evaluation datasets.
- Develop LLM verification agents to catch errors and compliance violations.
- Integrate evals into the deployment pipeline for quality assurance.
- Partner with teams to evaluate cross-model performance.
- Define quality SLOs and build alerting for regressions.
- Establish eval methodology as a competitive advantage.
Requirements
- 8+ years of engineering experience, including at least 2 years focused on evaluation infrastructure.
- Deep familiarity with evaluation and scoring methodologies for AI systems.
- Experience designing and curating golden datasets.
- Comfort working across the stack including data engineering and backend integration.
- Strong communication skills to translate domain requirements into measurable criteria.
- A bias toward shipping and building user-friendly tools.
Benefits
- Hybrid work schedule to promote collaboration.
- Stunning office spaces designed for comfort and productivity.
- Competitive pay and equity for eligible positions.
- Premium healthcare, dental, and vision insurance plans.
- 401(k) savings plan with a 4% match and immediate vesting.
- 16 weeks of paid parental leave after one year of employment.
- Professional growth and development opportunities.
- Company perks program including discounts on various services.
- Financial guidance program for personal finance management.
- One-month work-from-anywhere policy.
