CoreWeave

Senior Software Engineer II, Applied Training

about 5 hours ago
Bellevue, WA, USA or Sunnyvale, CA, USA
Senior / Staff+
H1B Sponsor

Base Salary

$182k - $242k/yr

Responsibilities

  • Contribute to the roadmap for Applied Training to identify essential workloads.
  • Collaborate closely with customers and internal teams on cloud-native primitives.
  • Design and build a complete research cluster experience, addressing researchers' challenges.
  • Own the Python SDK for sandbox infrastructure, enabling large-scale RL training runs.
  • Write documentation for OSS training frameworks to assist customers.
  • Engage with infrastructure teams and customers to enhance system designs.

Requirements

  • 8–12+ years of experience building distributed systems or ML infrastructure.
  • Proven experience with Kubernetes, including custom controllers and workload orchestration.
  • Understanding of researcher productivity and the importance of efficient workflows.
  • Familiarity with distributed job scheduling and large-scale training challenges.
  • Experience shipping production systems that users rely on.
  • Strong communication skills to translate customer needs into system designs.

Benefits

  • 100% paid medical, dental, and vision insurance.
  • Company-paid life insurance and short/long-term disability insurance.
  • Flexible Spending Account and Health Savings Account options.
  • Tuition reimbursement and participation in the Employee Stock Purchase Program.
  • Mental wellness benefits and family-forming support.
  • Flexible PTO and catered lunch in office locations.

Categories

AI & ML, Data Engineering, DevOps