
AI Infrastructure Engineer (GPU) - Remote EMEA
Pragmatike · 17 days ago
Prague, Czechia (+8 more) · Mid Level / Senior
Responsibilities
- Build and operate production-grade model serving infrastructure using frameworks such as vLLM, TGI, or Triton.
- Design and implement robust deployment pipelines with blue/green and canary rollout strategies.
- Develop and maintain auto-scaling systems and intelligent request routing layers.
- Optimize GPU utilization, memory efficiency, and network throughput.
- Design observability systems for tracking inference metrics and system health.
- Manage model registries and CI/CD pipelines for automated deployments.
- Own the full lifecycle of ML systems, including operational support.
- Define engineering best practices and contribute to platform scalability.
Requirements
- 4+ years of experience in MLOps, Platform Engineering, or similar roles focused on ML systems.
- Hands-on experience with model serving frameworks such as vLLM, TGI, or Triton.
- Strong background in container orchestration and GPU-based workloads.
- Experience with MLOps tooling including model registries and automated deployment pipelines.
- Proficiency in Python and infrastructure-as-code tools like Terraform or Helm.
- Strong understanding of distributed systems and production reliability engineering.
- Ability to effectively use AI coding assistants for development and debugging.
- Ownership mindset with the ability to operate independently in a remote environment.
Benefits
- Take ownership of critical infrastructure for a rapidly scaling AI-native cloud platform.
- Build foundational ML inference systems from the ground up.
- Work at the intersection of distributed systems, GPU computing, and sustainable cloud architecture.
- Gain deep expertise in next-generation AI infrastructure and large-scale model serving systems.
- Influence core engineering decisions and define scalable best practices.