Crusoe

Senior Production Engineer

about 3 hours ago
San Francisco, CA, USA or Sunnyvale, CA, USA
Senior
H-1B Sponsor

Base Salary

$209k - $253k/yr

Responsibilities

  • Design and operate reliable managed AI services focused on LLM workloads.
  • Build automation and reliability tooling for distributed AI pipelines.
  • Define, measure, and improve SLIs/SLOs for AI workloads.
  • Collaborate with teams to optimize large-scale training and inference clusters.
  • Automate observability and performance tuning for latency-sensitive AI services.
  • Investigate and resolve reliability issues in distributed AI systems.
  • Contribute to the architecture of next-generation distributed systems.

Requirements

  • Strong software engineering background with experience in production-grade systems.
  • Demonstrated experience in distributed systems design and implementation.
  • Hands-on work with large language models or AI/ML infrastructure.
  • Experience with defining and measuring SLIs/SLOs and building monitoring systems.
  • Proficiency in at least one modern programming language (Python, Go, Java, C++).
  • Familiarity with Kubernetes or container orchestration platforms.
  • Strong collaboration and communication skills.

Benefits

  • Industry-competitive pay.
  • Restricted Stock Units in a fast-growing technology company.
  • Health insurance options, including HDHP and PPO plans, plus vision and dental.
  • Employer contributions to HSAs.
  • Paid parental leave.
  • Paid life insurance and short-term and long-term disability coverage.
  • 401(k) with a 100% match up to 4% of salary.
  • Generous paid time off and holiday schedule.
  • Tuition reimbursement.
  • Company-paid commuter benefit of $300 per month.