CoreWeave

Staff Software Engineer, Applied Training

5 days ago
Bellevue, WA, USA +2 more
Staff+

Base Salary

$165k - $242k/yr

Responsibilities

  • Contribute to the roadmap for Applied Training by identifying key workload unlocks.
  • Design and build a complete research cluster experience, including CLI and job configuration.
  • Own the Python SDK for sandbox infrastructure and ensure integration with Kubernetes clusters.
  • Write documentation for running popular open-source training frameworks on CoreWeave.
  • Collaborate with infrastructure teams and customers to understand their supercomputing needs.

Requirements

  • 8-12+ years of experience building distributed systems or ML infrastructure.
  • Hands-on Kubernetes experience, including custom controllers and workload orchestration.
  • Strong understanding of what makes researchers productive in their workflows.
  • Familiarity with distributed job scheduling and large-scale training challenges.
  • Proven track record of shipping production infrastructure relied upon by users.

Benefits

  • 100% paid medical, dental, and vision insurance.
  • Company-paid life insurance and short/long-term disability insurance.
  • Flexible Spending Account and Health Savings Account options.
  • Tuition reimbursement and participation in Employee Stock Purchase Program.
  • Mental wellness benefits and family-forming support.
  • Flexible PTO and catered lunch each day.

Tech Stack

Kubernetes, Python, PyTorch

Categories

AI & ML, Data Engineering, DevOps