Software Engineer, Infrastructure

10 months ago

San Francisco, CA, USA

Mid Level / Senior

H1B Sponsor

Base Salary

$180k - $350k/yr

Responsibilities

Build Kubernetes orchestration on a $20M GPU cluster.
Scale an AWS batch job system for map-reduce jobs across tens of thousands of machines.
Design GPU scheduling software for optimal cluster utilization.
Implement observability in production systems.

Requirements

Experience designing and operating large-scale infrastructure such as GPU clusters or Kubernetes clusters.
Strong focus on reliability, observability, and optimization across the stack.

Benefits

In-person opportunity in San Francisco.
Open to sponsoring international candidates (e.g., STEM OPT, OPT, H1B, O1, E3).

Tech Stack

AWS, Kubernetes, Rust

Categories

AI & ML, Data Engineering, DevOps