Roku

Senior Software Engineer, Machine Learning (ML Ops)

about 2 hours ago
Bengaluru, India
Senior / Staff+
H1B Sponsor

Responsibilities

  • Lead the design and operation of scalable, production-grade cloud infrastructure for ML workloads across AWS and GCP.
  • Architect and improve CI/CD systems for ML models and platform services.
  • Own and evolve low-latency infrastructure for real-time model inference.
  • Define and enforce observability standards for ML systems.
  • Participate in on-call rotation for incident response and root-cause analysis.
  • Partner with data scientists and ML engineers to enhance platform usability.
  • Champion operational excellence through automation and continuous improvement.

Requirements

  • BS or MS in Computer Science, Engineering, or a related quantitative field.
  • 8+ years of experience in DevOps, SRE, or ML infrastructure.
  • Strong programming skills in Python and/or Scala or Java.
  • Deep experience with Kubernetes and container orchestration on GCP and/or AWS.
  • Expertise with NoSQL or low-latency data stores.
  • Hands-on experience with data and orchestration technologies.
  • Experience building and maintaining CI/CD systems using tools like Jenkins or GitLab Runner.
  • Familiarity with feature engineering platforms and model lifecycle tools.
  • Strong infrastructure-as-code experience with Terraform.
  • Experience with observability platforms such as Prometheus and Grafana.
  • Excellent communication and cross-functional collaboration skills.

Benefits

  • Global access to mental health and financial wellness support.
  • Comprehensive healthcare benefits including medical, dental, and vision.
  • Support for taking time off in accordance with local leave policies.
  • Retirement options including 401(k)/pension.

Tech Stack

Apache Airflow, Apache Flink, Apache Kafka, Apache Spark, AWS, Datadog, Google Cloud Platform, Grafana, Java, Jenkins, Kubernetes, MLflow, Prometheus, Python, Scala, Terraform
