AI Engineer, Evaluation & Quality - 11319

about 19 hours ago

Bengaluru, India

Mid Level / Senior

H1B Sponsor

Responsibilities

Build and maintain automated evaluation pipelines for AI model quality.
Implement task-specific benchmarks and test suites.
Design quality dashboards tracking accuracy, regression, and safety metrics.
Implement automated regression testing for every model iteration.
Build comparison frameworks for side-by-side evaluation of model variants.
Analyze evaluation results to identify failure modes and report them to the ML team.
Maintain evaluation datasets: versioning, quality validation, coverage analysis.
Support A/B testing infrastructure for production model validation.

Requirements

3+ years of software engineering or quality engineering experience.
Proficiency in Python with strong testing and automation skills.
Experience with statistical analysis and data visualization.
Understanding of ML model evaluation concepts (precision, recall, F1, human evaluation).
Experience building automated test frameworks and CI/CD pipelines.
Familiarity with dashboarding tools.
Strong analytical and problem-solving skills.
BS in Computer Science, Statistics, or equivalent experience.

Python