Software Engineer, Inference – AMD GPU Enablement
OpenAI
5 months ago
San Francisco, CA, USA
Mid Level / Senior
Base Salary
$295k - $555k/yr
Responsibilities
- Own the bring-up, correctness, and performance of the OpenAI inference stack on AMD hardware.
- Integrate internal model-serving infrastructure into various GPU-backed systems.
- Debug and optimize distributed inference workloads across memory, network, and compute layers.
- Validate correctness, performance, and scalability of model execution on large GPU clusters.
- Collaborate with teams to design and optimize high-performance GPU kernels for accelerators.
- Build, integrate, and tune collective communication libraries for parallel model execution.
Requirements
- Experience writing or porting GPU kernels using HIP, CUDA, or Triton.
- Familiarity with communication libraries like NCCL/RCCL.
- Experience with distributed inference systems and scaling models across fleets of accelerators.
- Strong problem-solving skills for end-to-end performance challenges.
- Excitement to work in a small, fast-moving team building new infrastructure.
Categories
AI & ML, Backend