about 2 hours ago
Bengaluru, India
Senior / Staff+
H1B Sponsor
Responsibilities
- Build and run the LLM control plane/gateway with smart routing and cost tracking.
- Ship a unified API and SDKs with normalized schemas and full observability.
- Enforce safety and privacy by default through content filtering and validation.
- Enable multi-model, multi-vendor use of LLMs with automated canarying.
- Own the agent runtime including tool registry and function calling.
- Design orchestration patterns and manage agent state and workflows.
- Create components for monitoring model and data drift.
- Add human-in-the-loop review before agents interact with dealer systems.
- Evolve the domain graph and build reliable data ingestion pipelines.
- Serve real-time context to agents with access controls.
- Power retrieval with hybrid search and smart caching.
- Run continuous evaluations for quality and safety of the platform.
- Define SLOs for latency, uptime, and cost.
- Maintain a model/agent registry and support compliance.
- Provide templates and documentation for product teams.
Requirements
- 5+ years of experience building large-scale data/ML or platform systems.
- Strong software engineering fundamentals including API design and distributed systems.
- Production experience with Python and one of Java/Scala/Go.
- Experience with MLOps at scale including pipelines and CI/CD for models.
- Familiarity with cloud services, preferably AWS, and container technologies.
- Practical ML knowledge including feature engineering and model deployment.
- Experience building or operating an LLM gateway/control plane.
- Knowledge of agentic systems and orchestration frameworks.
Tech Stack
Apache Airflow, Apache Flink, Apache Kafka, Apache Spark, AWS, Docker, Go, GraphQL, Java, Kubernetes, MLflow, Neo4j, Python, Scala
Categories
AI & ML, Backend, Data Science