Berlin, Germany · Senior / Staff+
Responsibilities
- Optimize real-time inference systems and multimodal models.
- Take ownership of machine learning models from research to production.
- Design benchmarks and prototypes to bring clarity to ill-defined problems.
- Collaborate with US-based leadership and engineering teams.
- Ensure performance, latency, and reliability are prioritized in product features.
Requirements
- Deep understanding of inference optimization, including frameworks such as vLLM or TensorRT-LLM.
- Hands-on experience with model acceleration methods such as quantization and distillation.
- Proficiency in C++, CUDA, Rust, or optimized Python for high-performance systems.
- Experience with distributed systems, Kubernetes, and multi-GPU inference.
- PhD in CS, Physics, Math, or equivalent practical experience in backend or ML systems.
- Professional fluency in English, both written and spoken.
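As context for the model-acceleration requirement above: one of the simplest techniques in that family is post-training weight quantization. The sketch below (illustrative only, not part of this role's codebase; names like `quantize_int8` are made up for the example) shows symmetric per-tensor int8 quantization with NumPy.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    # Map the largest absolute weight to the int8 limit 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Rounding bounds the per-element reconstruction error by scale / 2.
```

Production systems (e.g. vLLM or TensorRT-LLM) use more sophisticated per-channel or activation-aware schemes, but the storage-vs-precision trade-off is the same idea.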
Benefits
- Full U.S. visa and relocation support may be available for candidates interested in relocating to the San Francisco Bay Area.
