
Machine Learning Engineer - Inference
Together AI
Posted 4 days ago
Base Salary
$160k - $230k/yr
Responsibilities
- Design and build production systems for the Together AI inference engine.
- Develop and optimize runtime inference services for large-scale AI applications.
- Collaborate with researchers, engineers, product managers, and designers.
- Conduct design and code reviews to ensure high standards of quality.
- Create services, tools, and developer documentation for the inference engine.
- Implement robust and fault-tolerant systems for data ingestion and processing.
Requirements
- 3+ years of experience writing high-performance, well-tested, production-quality code.
- Proficiency with Python and PyTorch.
- Experience building high-performance libraries and tooling.
- Excellent understanding of low-level operating system concepts.
- Preferred: Knowledge of existing AI inference systems like TGI, vLLM, TensorRT-LLM, Optimum.
- Preferred: Knowledge of AI inference techniques such as speculative decoding.
- Preferred: Knowledge of CUDA/Triton programming.
- Nice to have: Knowledge of Rust, Cython, and compilers.
Benefits
- Competitive compensation and startup equity.
- Health insurance and other competitive benefits.