Machine Learning Engineer - Model Evaluations, Public Sector
Scale AI
3 months ago
New York, NY, USA +3 more
Mid Level / Senior
H1B Sponsor
Base Salary
$187k - $300k/yr
Responsibilities
- Develop and maintain automated evaluation pipelines for ML models across various metrics.
- Design test datasets and benchmarks to measure model performance and safety.
- Build evaluation frameworks for LLM agents, including scenario-based testing.
- Conduct comparative analyses of model architectures and evaluation outcomes.
- Implement tools for continuous monitoring and quality assurance of ML systems.
- Design and execute stress tests to uncover vulnerabilities in models.
- Collaborate with operations teams to produce high-quality evaluation datasets.
Requirements
- Experience in computer vision, deep learning, reinforcement learning, or NLP in production settings.
- Strong programming skills in Python; experience with TensorFlow or PyTorch.
- Background in algorithms, data structures, and object-oriented programming.
- Experience with LLM pipelines, simulation environments, or automated evaluation systems.
- Ability to convert research insights into measurable evaluation criteria.
- Active security clearance or ability to obtain one.
Benefits
- Comprehensive health, dental, and vision coverage.
- Retirement benefits and a learning and development stipend.
- Generous PTO and potential commuter stipend.
Tech Stack
AWS, Google Cloud Platform, Python, PyTorch, TensorFlow
Categories
AI & ML, Data Science, Testing