Base Salary
$180k - $250k/yr
Responsibilities
- Design and build a low-latency, scalable, and reliable model inference and serving stack.
- Collaborate with research and product engineers to deliver products efficiently.
- Create robust inference infrastructure and monitoring systems.
- Shape product applications and impact AI deployment across devices.
Requirements
- Strong engineering skills with experience in complex codebases.
- Experience building large-scale distributed systems focused on performance and reliability.
- Technical leadership with a track record of delivering results in ambiguous situations.
- Background in inference pipelines for machine learning and generative models.
- Experience implementing state-of-the-art machine learning models.
- Preferred: experience with vLLM, SGLang, or other inference frameworks.
- Preferred: experience with CUDA, Triton, or similar technologies.
Benefits
- Competitive base salary with an attractive equity package.
- Monthly commuter allowance for travel to the office.
- Flexible PTO to recharge as needed.
- Daily meals and snacks provided in the office.
- Unique perks like your own personal Yoshi.
Categories
AI & ML, Data Science
