Magic

Member of Technical Staff, Inference & RL Systems

4 months ago
San Francisco, CA, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$225k - $550k/yr

Responsibilities

  • Design and scale high-performance inference serving systems.
  • Optimize KV-cache management, batching strategies, and scheduling.
  • Improve throughput and latency for long-context workloads.
  • Build and maintain distributed RL and post-training infrastructure.
  • Improve reliability of rollout, evaluation, and reward pipelines.
  • Automate fault detection and recovery for serving and RL systems.
  • Profile and eliminate performance bottlenecks across GPU, networking, and storage layers.
  • Collaborate with Kernels and Research to align execution systems with model architecture.

Requirements

  • Strong software engineering and distributed systems fundamentals.
  • Experience building or operating large-scale inference or training systems.
  • Deep understanding of GPU execution constraints and memory trade-offs.
  • Experience debugging performance issues in production ML systems.
  • Ability to reason about system-level trade-offs between latency, throughput, and cost.
  • Track record of owning critical production infrastructure.

Benefits

  • Annual salary range: $225K - $550K.
  • Equity is a significant part of total compensation, in addition to salary.
  • 401(k) plan with 6% salary matching.
  • Generous health, dental and vision insurance for you and your dependents.
  • Unlimited paid time off.
  • Visa sponsorship and a relocation stipend to bring you to SF, where possible.
