
AI Engineer, AIOps & Infrastructure
Eloquent AI · 9 months ago
Responsibilities
- Design and build scalable ML infrastructure for AI agents in production.
- Automate LLMOps and MLOps workflows for model training and deployment.
- Optimize GPU and cloud compute workloads for large-scale AI systems.
- Develop Kubernetes-based solutions for ML model orchestration.
- Implement logging, monitoring, and performance tracking for AI models.
- Work with ML and engineering teams to streamline data pipelines and model serving.
- Ensure security, compliance, and reliability in AI infrastructure.
- Participate in on-call rotations to keep AI systems reliable 24/7.
Requirements
- 5+ years of experience in software engineering, MLOps, or infrastructure development.
- Strong expertise in Kubernetes and managing containerized ML workloads.
- Deep understanding of cloud platforms such as AWS, GCP, or Azure.
- Proficiency in Python for developing ML/AI application services.
- Experience with ML model deployment pipelines and monitoring.
- Familiarity with vector databases and RAG architectures is a plus.
- Strong problem-solving skills in a high-scale AI environment.