Together AI

Machine Learning Engineer - Inference

4 days ago
San Francisco, CA, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$160k - $230k/yr

Responsibilities

  • Design and build production systems for the Together AI inference engine.
  • Develop and optimize runtime inference services for large-scale AI applications.
  • Collaborate with researchers, engineers, product managers, and designers.
  • Conduct design and code reviews to ensure high standards of quality.
  • Create services, tools, and developer documentation for the inference engine.
  • Implement robust and fault-tolerant systems for data ingestion and processing.

Requirements

  • 3+ years of experience writing high-performance, well-tested, production-quality code.
  • Proficiency with Python and PyTorch.
  • Experience building high-performance libraries and tooling.
  • Excellent understanding of low-level operating system concepts.
  • Preferred: Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, and Optimum.
  • Preferred: Knowledge of AI inference techniques such as speculative decoding.
  • Preferred: Knowledge of CUDA/Triton programming.
  • Nice to have: Knowledge of Rust, Cython, and compilers.

Benefits

  • Competitive compensation and startup equity.
  • Health insurance and other competitive benefits.

Tech Stack

Python, PyTorch, Rust

Categories

AI & ML, Data Engineering