
Machine Learning Infrastructure Engineer
Character.AI
about 1 year ago
Foster City, CA, USA
Mid Level / Senior
H1B Sponsor
Base Salary
$150k - $350k/yr
Responsibilities
- Provide infrastructure support to ML research and product.
- Build tooling to diagnose cluster issues and hardware failures.
- Monitor deployments and manage experiments.
- Maximize GPU allocation and utilization for serving and training.
Requirements
- 4+ years of experience supporting infrastructure within an ML environment.
- Experience in developing tools for diagnosing ML infrastructure problems.
- Experience with cloud infrastructure such as Google Compute Engine and Kubernetes.
- Experience working with GPUs.
Tech Stack
Kubernetes, PyTorch, TensorFlow
Categories
AI & ML, Data Engineering