Software Engineer, Inference - Multi Modal
OpenAI
San Francisco, CA, USA
Mid Level / Senior
Base Salary
$295k - $555k/yr
Responsibilities
- Design and implement inference infrastructure for large-scale multimodal models.
- Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs.
- Enable experimental research workflows to transition into reliable production services.
- Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities.
- Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers.
Requirements
- Experience building and scaling inference systems for LLMs or multimodal models.
- Familiarity with GPU-based ML workloads and performance dynamics of large models.
- Comfortable working across systems that span networking, distributed compute, and high-throughput data handling.
- Familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems.
- Ability to own problems end-to-end and operate in ambiguous, fast-moving spaces.
Categories
AI & ML, Backend, Data Engineering