
AI Infrastructure Engineer (GPU) - Remote EMEA
Pragmatike · 17 days ago
Prague, Czechia (+8 more) · Mid Level / Senior
Responsibilities
- Build and operate production-grade model serving infrastructure using frameworks such as vLLM, TGI, or Triton.
- Design and implement robust deployment pipelines with blue/green and canary rollout strategies.
- Develop and maintain auto-scaling systems and intelligent request routing layers.
- Optimize GPU utilization, memory efficiency, and network throughput.
- Design observability systems for tracking inference metrics and system health.
- Manage model registries and CI/CD pipelines for automated deployments.
- Own the full lifecycle of ML systems, including operational support.
- Define engineering best practices and contribute to platform scalability.
Requirements
- 4+ years of experience in MLOps, Platform Engineering, or similar roles focused on ML systems.
- Hands-on experience with model serving frameworks such as vLLM, TGI, or Triton.
- Strong background in container orchestration and GPU-based workloads.
- Experience with MLOps tooling including model registries and automated deployment pipelines.
- Proficiency in Python and infrastructure-as-code tools like Terraform or Helm.
- Strong understanding of distributed systems and production reliability engineering.
- Ability to effectively use AI coding assistants for development and debugging.
- Ownership mindset with the ability to operate independently in a remote environment.
Benefits
- Take ownership of critical infrastructure for a rapidly scaling AI-native cloud platform.
- Build foundational ML inference systems from the ground up.
- Work at the intersection of distributed systems, GPU computing, and sustainable cloud architecture.
- Gain deep expertise in next-generation AI infrastructure and large-scale model serving systems.
- Influence core engineering decisions and define scalable best practices.