ML Platform / MLOps Engineer

about 1 month ago

Oakland, CA, USA

Mid Level / Senior

Base Salary

$180k - $250k/yr

Responsibilities

Develop infrastructure for large-scale ML training and inference on GPU clusters.
Implement and maintain security best practices across ML infrastructure.
Monitor and optimize infrastructure performance, reliability, and cost.
Evaluate open source infrastructure solutions and cloud providers.
Build and maintain machine learning pipelines for model inference.
Implement CI/CD pipelines for machine learning models and services.
Develop tooling to help researchers transition from experiments to production models.

Qualifications

BS in Computer Science or a related field.
3+ years of experience building or operating production ML systems.
Experience with MLOps, ML infrastructure, or ML platform engineering.
Strong experience with cloud infrastructure, preferably GCP.
Experience with containerized workloads (Docker) and orchestration systems such as Kubernetes.
Experience building data or ML pipelines.
Familiarity with CI/CD and infrastructure-as-code practices.

Docker, Google Cloud Platform, Kubernetes, MLflow, PyTorch