Scale AI

Machine Learning Engineer - Model Evaluations, Public Sector

3 months ago
New York, NY, USA +3 more
Mid Level / Senior
H1B Sponsor

Base Salary

$187k - $300k/yr

Responsibilities

  • Develop and maintain automated evaluation pipelines for ML models across various metrics.
  • Design test datasets and benchmarks to measure model performance and safety.
  • Build evaluation frameworks for LLM agents, including scenario-based testing.
  • Conduct comparative analyses of model architectures and evaluation outcomes.
  • Implement tools for continuous monitoring and quality assurance of ML systems.
  • Design and execute stress tests to uncover vulnerabilities in models.
  • Collaborate with operations teams to produce high-quality evaluation datasets.

Requirements

  • Experience in computer vision, deep learning, reinforcement learning, or NLP in production settings.
  • Strong programming skills in Python; experience with TensorFlow or PyTorch.
  • Background in algorithms, data structures, and object-oriented programming.
  • Experience with LLM pipelines, simulation environments, or automated evaluation systems.
  • Ability to convert research insights into measurable evaluation criteria.
  • Active security clearance or ability to obtain one.

Benefits

  • Comprehensive health, dental, and vision coverage.
  • Retirement benefits and a learning and development stipend.
  • Generous PTO and potential commuter stipend.

Tech Stack

AWS, Google Cloud Platform, Python, PyTorch, TensorFlow

Categories

AI & ML, Data Science, Testing