Machine Learning Engineer I

Tekion

about 2 months ago

Bengaluru, India

Entry Level / Mid Level

H1B Sponsor

Responsibilities

Accelerate the rollout of LLM-powered and agent-driven features across Tekion products.
Enable agentic workflows that automate, reason, and interact on behalf of users and internal stakeholders.
Operationalize secure, compliant, and explainable LLM and agentic services at scale.
Convert Applied Sciences models into scalable, compliant, cost-efficient production services.
Standardize how models are trained, validated, deployed, and monitored across Tekion products.
Power real-time, context-aware experiences by integrating batch/stream features, graph context, and online inference.
Turn Applied Sciences prototype models into fast, reliable services with well-defined API contracts.
Integrate with the LLM Gateway/MCP and with prompt/config versioning.
Build and orchestrate CI/CD pipelines.
Review data science models, then refactor, optimize, containerize, deploy, version, and monitor them for quality.
Collaborate with data scientists, data engineers, product managers, and architects to design enterprise systems.
Monitor, detect, and mitigate risks unique to LLMs and agentic systems.
Implement prompt management: versioning, A/B testing, guardrails, and dynamic orchestration based on feedback and metrics.
Design batch/stream pipelines and online features linked to our domain graph.
Build inference microservices with schema versioning, structured outputs, and stringent p95 latency targets.
Manage the model/feature lifecycle: feature store strategy, model/agent registry, versioning, and lineage.
Instrument deep observability: traces/logs/metrics, data/feature drift, model performance, safety signals, and cost tracking.
Ensure real-time reliability: autoscaling, caching, circuit breakers, retries/fallbacks, and graceful degradation.
Develop templates/SDKs/CLIs, sandbox datasets, and documentation that make shipping ML the default path.

Requirements

2.5 to 4 years of experience in ML engineering/MLOps, or in backend/platform engineering with production ML.
Experience with LLMs, retrieval systems, vector stores, and graph/knowledge stores.
Strong software engineering fundamentals: Python plus one of Java/Go/Scala; API design; concurrency; testing.
Hands-on experience with orchestration frameworks and libraries.
Knowledge of agent architectures and safe execution patterns.
Experience with pipelines and data tools like Airflow/Kubeflow, Spark/Flink, Kafka/Kinesis.
Familiarity with microservices and runtime technologies like Docker/Kubernetes.
Experience with model ops practices including experiment tracking and drift detection.
Knowledge of observability tools like OpenTelemetry/Prometheus/Grafana.
Familiarity with cloud services, preferably AWS, and security/compliance practices.

Tech Stack

Amazon DynamoDB, Apache Airflow, Apache Flink, Apache Kafka, Apache Spark, AWS, Docker, Go, Grafana, gRPC, Java, Kubernetes, MLflow, Prometheus, Python, Scala
