Together AI

Senior Backend Engineer, Inference Platform

4 days ago

Base Salary

$160k - $250k/yr

Responsibilities

  • Build and optimize global and local request routing for low-latency load balancing.
  • Develop auto-scaling systems to dynamically allocate resources across data centers.
  • Design systems for multi-tenant traffic shaping and resource allocation.
  • Engineer trade-offs between latency and throughput for diverse workloads.
  • Optimize prefix caching to enhance model compute efficiency.
  • Collaborate with ML researchers to scale new model architectures.
  • Continuously profile and analyze system performance to identify bottlenecks.

Requirements

  • 5+ years of experience building large-scale, fault-tolerant distributed systems.
  • Strong background in designing and improving complex systems for efficiency and scalability.
  • Excellent understanding of low-level OS concepts such as multi-threading and memory management.
  • Expert-level programming skills in Rust, Go, Python, or TypeScript.
  • Knowledge of modern LLMs and generative models is a plus.
  • Experience with the open source ecosystem around inference is highly valuable.
  • Familiarity with Kubernetes or container orchestration is a strong plus.
  • Knowledge of GPU software stacks and HPC technologies is a plus.
  • Bachelor’s or Master’s degree in Computer Science or related field, or equivalent experience.

Benefits

  • Competitive compensation and equity.
  • Health insurance and other competitive benefits.

Tech Stack

Go, Kubernetes, Python, Rust, TypeScript
