AI Infrastructure Engineer, Model Serving Platform
Scale AI
about 1 year ago
New York, NY, USA or San Francisco, CA, USA
Mid Level / Senior
H-1B Sponsor
Base Salary
$175k - $220k/yr
Responsibilities
- Develop reusable platforms for running in-house and open-source LLM benchmarks.
- Ensure correctness and performance of post-training and evaluation jobs on the platform.
- Improve APIs for managing ML workflows.
- Contribute to foundational infrastructure for model inference and training.
- Participate in the team's on-call process to ensure service availability.
- Own projects end-to-end, from requirements to implementation.
Requirements
- 4+ years of experience developing ML platforms.
- Strong fundamentals in machine learning and backend system design.
- Experience training and/or benchmarking LLMs.
- Proficiency in Python, Docker, Kubernetes, and infrastructure as code (e.g., Terraform).
- Passion for collaborating with researchers to drive business impact.
Benefits
- Comprehensive health, dental, and vision coverage.
- Retirement benefits.
- Learning and development stipend.
- Generous PTO.
- Potential commuter stipend.
Tech Stack
AWS, Docker, Google Cloud Platform, Kubernetes, Python, Terraform
Categories
AI & ML, Backend, Data Science, DevOps