San Francisco, CA, USA
Mid Level / Senior
Base Salary
$380k - $555k/yr
Responsibilities
- Design and build high-performance inference runtimes for large-scale AI models.
- Own and optimize core execution paths, including model execution and memory management.
- Develop and improve distributed inference across multiple GPUs.
- Implement and optimize inference-critical operators and kernels.
- Partner with research teams to support new model architectures in inference systems.
- Diagnose and resolve performance bottlenecks through profiling and debugging.
- Contribute to the observability and reliability of large-scale AI systems.
Requirements
- Experience building production inference systems.
- Comfortable with GPU-centric performance engineering.
- Experience with multi-GPU or distributed systems.
- Ability to reason end-to-end about inference pipelines.
- Ability to understand research ideas and implement them within real system constraints.
- Enthusiasm for solving complex systems problems that emerge at scale.
- A preference for hands-on technical ownership over abstract design work.
Categories
AI & ML, Backend, DevOps