ML Infrastructure Engineer, Safeguards

11 months ago

H1B Sponsor

Base Salary

$300k - $405k/yr

Responsibilities

Design and build scalable ML infrastructure for classifier and safety evaluations.
Build monitoring and observability tools for model performance and system health.
Collaborate with research teams to productionize safety research.
Optimize inference latency and throughput for real-time safety evaluations.
Implement automated testing, deployment, and rollback systems for ML models.
Partner with teams across the organization to deliver infrastructure that meets their safety needs.
Contribute to the development of internal tools and frameworks for safety research.

Qualifications

5+ years of experience building production ML infrastructure in safety-critical domains.
Proficient in Python and experienced with ML frameworks like PyTorch, TensorFlow, or JAX.
Hands-on experience with cloud platforms (AWS, GCP) and container orchestration (Kubernetes).
Understanding of distributed systems principles for high-throughput, low-latency workloads.
Experience with data engineering tools and building robust data pipelines.
Results-oriented with a focus on reliability in safety-critical systems.
Enthusiasm for collaborating with researchers to translate research into production systems.
Deep care about AI safety and its societal impacts.