Parspec

AI Ops Engineer

about 2 months ago
Bengaluru, India
Senior

Responsibilities

  • Design and build document AI platforms powered by generative AI.
  • Implement event-driven and queue-based systems for scalable AI workflows.
  • Architect and maintain self-hosted LLM infrastructure on AWS.
  • Manage production systems for LLM serving and AI workflow orchestration.
  • Develop monitoring systems to reduce hallucinations and unsafe outputs.
  • Implement end-to-end observability for AI/ML pipelines.
  • Track performance metrics for AI systems.
  • Manage machine learning workflows and enable experiment tracking.
  • Implement AI platform security controls and optimize AWS infrastructure.

Requirements

  • Strong experience with AWS cloud infrastructure and services.
  • Experience building ML infrastructure using Infrastructure-as-Code tools.
  • Hands-on experience deploying LLM serving infrastructure.
  • Experience managing vector databases and retrieval systems.
  • Strong experience designing event-driven or asynchronous systems.
  • Experience implementing observability for distributed AI systems.
  • Strong programming experience in Python, including asynchronous programming.
  • Experience with Docker, Kubernetes, and CI/CD pipelines.
  • 5+ years of experience in MLOps, LLMOps, AIOps, or DevOps.

Benefits

  • Competitive salary and benefits including family insurance coverage.
  • Free health teleconsultations and learning/upskilling budgets.
  • Equity in the company.
  • Flexible hours and a hybrid work setup.
  • Unlimited PTO.
  • Opportunity to grow with a fast-scaling company.

Tech Stack

Apache Kafka, AWS, Docker, FastAPI, GitHub Actions, Grafana, Kubernetes, MLflow, Prometheus, Python, Terraform

Categories

AI & ML, Data Engineering, DevOps