Base Salary
$150k - $300k/yr
Responsibilities
- Design, build, and maintain evaluation benchmarks for model performance.
- Develop metrics and workflows to identify new failure modes in datasets.
- Collaborate with ML engineers to translate evaluation insights into model improvements.
- Work with unstructured data to uncover edge cases and hard examples.
- Build internal and user-facing tools for result inspection and analysis.
- Engage with customers to understand data needs and create tailored benchmarks.
Requirements
- High standards for quality and precision in your work.
- Strong problem-solving skills and ability to build from first principles.
- Proficiency in Python and the ability to build reliable technical solutions.
- Experience with data infrastructure such as AWS S3 and analytics systems.
- Comfort working with unstructured data and identifying failure cases.
- Ability to collaborate across technical and non-technical teams.
Benefits
- Unlimited PTO for work-life balance.
- Free daily lunch with teammates at the office.
- Reimbursed transportation costs.
- Generous health insurance covering medical, dental, and vision.
- Health and wellness budget of up to $150/month.
- Flexible parental leave options.
