Scale AI

AI Infrastructure Engineer, Model Serving Platform

about 1 year ago
New York, NY, USA or San Francisco, CA, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$175k - $220k/yr

Responsibilities

  • Develop reusable platforms for running in-house and open-source LLM benchmarks.
  • Ensure correctness and performance of post-training and evaluation jobs on the platform.
  • Improve APIs for managing ML workflows.
  • Contribute to foundational infrastructure for model inference and training.
  • Participate in the team's on-call process to ensure service availability.
  • Own projects end-to-end, from requirements to implementation.

Requirements

  • 4+ years of experience developing ML platforms.
  • Strong fundamentals in machine learning and backend system design.
  • Experience training and/or benchmarking LLMs.
  • Proficiency in Python, Docker, Kubernetes, and infrastructure as code (e.g., Terraform).
  • Passion for collaborating with researchers to drive business impact.

Benefits

  • Comprehensive health, dental, and vision coverage.
  • Retirement benefits.
  • Learning and development stipend.
  • Generous PTO.
  • Potential commuter stipend.

Tech Stack

AWS, Docker, Google Cloud Platform, Kubernetes, Python, Terraform

Categories

AI & ML, Backend, Data Science, DevOps