OpenAI

Software Engineer, Inference - Multi Modal


9 months ago
San Francisco, CA, USA
Mid Level / Senior

Base Salary

$295k - $555k/yr

Responsibilities

  • Design and implement inference infrastructure for large-scale multimodal models.
  • Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs.
  • Enable experimental research workflows to transition into reliable production services.
  • Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities.
  • Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers.

Requirements

  • Experience building and scaling inference systems for LLMs or multimodal models.
  • Familiarity with GPU-based ML workloads and performance dynamics of large models.
  • Comfortable working with systems that span networking, distributed compute, and high-throughput data handling.
  • Familiarity with inference tooling such as vLLM, TensorRT-LLM, or custom model-parallel systems.
  • Ability to own problems end-to-end and operate in ambiguous, fast-moving spaces.

Categories

AI & ML, Backend, Data Engineering