GrepJob
Relace

Infrastructure Engineer

6 months ago

Responsibilities

  • Design and operate high-performance inference and training infrastructure.
  • Build reliable systems for deploying and scaling ML workloads globally.
  • Work on GPU scheduling and distributed systems.
  • Optimize performance and cost across compute, networking, and storage layers.
  • Collaborate with engineers to enhance the capabilities of small models.

Requirements

  • 2+ years of experience writing high-quality production code.
  • Strong experience with cloud infrastructure (AWS, GCP, Azure, or equivalent).
  • Experience with data science and systems optimization.
  • Familiarity with ML infrastructure and GPUs is a plus.
  • Willingness to work from the San Francisco office in the Financial District (FiDi).

Tech Stack

AWS, Azure, Google Cloud Platform

Categories

AI & ML, Data Engineering, DevOps