Exa

Software Engineer, Infrastructure

10 months ago
San Francisco, CA, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$180k - $350k/yr

Responsibilities

  • Build Kubernetes orchestration on a $20M GPU cluster.
  • Scale the AWS batch job system that runs map-reduce jobs across tens of thousands of machines.
  • Design GPU scheduling software for optimal cluster utilization.
  • Implement observability in production systems.

Requirements

  • Experience designing and operating large-scale infrastructure such as GPU clusters or Kubernetes clusters.
  • Strong focus on reliability, observability, and optimization across the stack.

Benefits

  • In-person opportunity in San Francisco.
  • Open to sponsoring international candidates (e.g., STEM OPT, OPT, H1B, O1, E3).

Categories

  • AI & ML
  • Data Engineering
  • DevOps