Inworld

Staff / Principal Machine Learning Engineer, Serving - UK

about 1 month ago
London, United Kingdom
Staff+ / Senior

Responsibilities

  • Develop and optimize real-time multimodal models and serving frameworks.
  • Implement techniques for inference optimization and model acceleration.
  • Profile and enhance performance of systems using C++, CUDA, Rust, or optimized Python.
  • Manage distributed systems and scaling for high-concurrency environments.
  • Take ownership of models from research to production, ensuring reliability.

Requirements

  • Deep understanding of modern serving frameworks like vLLM or TRT-LLM.
  • Hands-on experience with quantization, distillation, and caching strategies.
  • Proficiency in high-performance programming languages and in profiling code.
  • Experience with Kubernetes, Ray, and multi-GPU/multi-node inference.
  • PhD in CS, Physics, Math, or equivalent practical experience.

Benefits

  • Competitive salary range of £140,000 – £200,000.
  • Equity and additional benefits included in total compensation.
  • Support for open-source contributions and sharing work.

Tech Stack

C++, Kubernetes, Python, Rust

Categories

AI & ML, Backend, Data Engineering