Anyscale

Distributed LLM Inference Engineer

4 days ago
Palo Alto, CA, USA or San Francisco, CA, USA
Mid Level / Senior
H1B sponsorship available

Base Salary

$170k - $247k/yr

Responsibilities

  • Collaborate with product teams to deliver end-to-end solutions for batch and online inference.
  • Integrate Ray Data and LLM engines to optimize large-scale ML inference.
  • Work with open-source software like vLLM and contribute improvements to the community.
  • Stay updated on state-of-the-art practices in the open-source and research communities.

Requirements

  • Familiarity with running ML inference at large scale with high throughput and low latency.
  • Experience with deep learning frameworks such as PyTorch.
  • Solid understanding of distributed systems and ML inference challenges.
  • Bonus points for knowledge of ML systems and experience using Ray.
  • Experience engaging with the communities around LLM engines such as vLLM, and contributing to deep learning frameworks.

Benefits

  • Stock options.
  • Healthcare plans with 99% premium coverage for employees and dependents.
  • 401k retirement plan.
  • Education and wellbeing stipend.
  • Paid parental leave.
  • Fertility benefits.
  • Paid time off.
  • Commute reimbursement.
  • 100% of in-office meals covered.

Tech Stack

PyTorch, TensorFlow

Categories

AI & ML, Data Engineering