Anthropic

Staff+ Software Engineer, Inference Runtime

about 3 hours ago
Remote, Worldwide +3 more
Staff+
H-1B Sponsor

Base Salary

$405k - $485k/yr

Responsibilities

  • Set technical direction for the team, owning the architecture and roadmap for the shared runtime of the inference serving stack.
  • Own and evolve the accelerator-agnostic runtime, including hands-on work in a performance-sensitive Rust and Python codebase.
  • Ensure new models and deployment targets pay only for their own specialization, keeping expansion costs low.
  • Drive efficient accelerator usage across GPU, TPU, and Trainium.
  • Build the runtime's validation surface around partitioned builds and change-scoped testing.
  • Act as a technical counterpart to the central Infrastructure org on compilers and build systems.
  • Mentor engineers through design and code reviews, raising the technical bar.

Requirements

  • Deep background in systems engineering or ML infrastructure with hands-on experience in performance profiling and optimization.
  • Real depth in at least one accelerator ecosystem (CUDA/GPU, TPU, or Trainium/AWS Neuron).
  • Significant software engineering experience in high-performance, large-scale distributed systems.
  • Track record of defining and using engineering metrics to drive improvement.
  • Experience driving technical alignment across organizational boundaries.
  • Strong written and verbal communication skills.

Benefits

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • Collaborative office space.