Base Salary
$180k - $250k/yr
Responsibilities
- Design and build a low-latency, scalable, and reliable model inference and serving stack.
- Collaborate with research and product engineers to deliver products efficiently.
- Create robust inference infrastructure and monitoring systems.
- Shape product applications and impact AI deployment across devices.
Requirements
- Strong engineering skills with experience in complex codebases.
- Experience building large-scale distributed systems focused on performance and reliability.
- Technical leadership with a track record of delivering results in ambiguous situations.
- Background in inference pipelines for machine learning and generative models.
- Experience implementing state-of-the-art machine learning models.
- Preferred: experience with vLLM, SGLang, or other inference frameworks.
- Preferred: experience with CUDA, Triton, or similar technologies.
Benefits
- Competitive base salary with an attractive equity package.
- Monthly commuter allowance for travel to the office.
- Flexible PTO to recharge as needed.
- Daily meals and snacks provided in the office.
- Unique perks like your own personal Yoshi.
Categories
AI & ML, Data Science
