Exa

Software Engineer, Infrastructure

4 months ago
Singapore, Singapore
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Build Kubernetes orchestration on a $20M GPU cluster.
  • Scale the AWS batch job system for map-reduce jobs across thousands of machines.
  • Design GPU scheduling software for optimal cluster utilization.
  • Implement observability in production systems.

Requirements

  • Experience designing and operating large-scale infrastructure, such as GPU clusters or Kubernetes.
  • Strong focus on reliability, observability, and optimization across the stack.

Benefits

  • Premium healthcare benefits including medical, dental, and vision.
  • Fertility benefits offered to all employees.
  • Monthly wellness stipend provided.

Categories

AI & ML, Data Engineering, DevOps