Maven Robotics

ML Infrastructure Engineer

about 3 hours ago
San Francisco, CA, USA
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Own the architecture, implementation, reliability, and evolution of machine learning infrastructure.
  • Build backend services for managing data, artifacts, jobs, logs, metadata, and compute resources.
  • Design scalable systems for workload orchestration, storage, observability, security, and automation.
  • Create intuitive internal tools that simplify complex infrastructure for engineers.
  • Lead discussions with cloud and ML compute providers regarding capacity planning and performance.

Requirements

  • Significant experience designing and operating production backend or compute infrastructure.
  • Proven track record of managing complex infrastructure projects from architecture to deployment.
  • Strong programming skills in Python, Go, Rust, C++, or similar languages.
  • Experience with GPU compute infrastructure and orchestrating workloads using Kubernetes or similar systems.
  • Familiarity with storage systems, observability platforms, and infrastructure-as-code.
  • Experience managing large-scale GPU fleets or hybrid cloud environments.
  • Ability to build internal developer platforms and self-service infrastructure tools.
  • Strong technical judgment and communication skills to drive decisions across teams.
  • Self-starter attitude with the ability to prioritize and deliver solutions in a startup environment.