
Research Engineer, Infrastructure, Inference
Thinking Machines Lab
Base Salary
$350k - $475k/yr
Responsibilities
- Work alongside researchers and engineers to bring cutting-edge AI models into production.
- Collaborate with research teams to enable high-performance inference for novel architectures.
- Design and implement new techniques, tools, and architectures that improve performance, latency, throughput, and efficiency.
- Optimize the codebase and compute fleet to fully utilize hardware resources.
- Extend orchestration frameworks for distributed inference and large-batch serving.
- Establish standards for reliability, observability, and reproducibility across the inference stack.
- Publish and share learnings through documentation, open-source libraries, or technical reports.
Requirements
- Bachelor’s degree or equivalent experience in computer science, engineering, or a related field.
- Understanding of deep learning frameworks and their underlying system architectures.
- Experience with inference serving systems optimized for throughput and latency.
- Ability to thrive in a highly collaborative environment with cross-functional partners.
- A bias for action and initiative to work across different stacks and teams.
- Strong engineering skills to contribute performant, maintainable code and debug complex codebases.
Benefits
- Generous health, dental, and vision benefits.
- Unlimited PTO and paid parental leave.
- Relocation support as needed.
Tech Stack
Kubernetes, PyTorch
Categories
AI & ML, Data Engineering