Staff Software Engineer, Machine Learning Inference Platform

about 2 months ago

Remote, Worldwide or Pittsburgh, PA, USA

Staff+

H-1B Sponsor

Responsibilities

Design platform architecture for multi-tenant inference workloads.
Develop robust API layers and developer SDKs for distributed inference orchestration.
Build and harden a multi-tenant control plane for accurate metering and tenant isolation.
Optimize inference performance across the entire system stack.
Build observability and define SLOs that give insight into system economics and performance.
Partner with product and infrastructure teams on model onboarding and capacity planning.
Promote a culture of engineering excellence within the team.

Qualifications

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
7+ years of experience building and operating backend distributed systems.
Demonstrated cross-team technical leadership in backend distributed systems or ML infrastructure.
Strong fundamentals in data-intensive distributed systems and performance profiling.
Hands-on experience with large-scale inference services on GPUs.
Direct experience with inference engines or serving frameworks.
Strong programming skills in C++, Go, Rust, or Python.
Familiarity with deep learning frameworks and GPU computing primitives.
Excellent verbal and written communication skills.
Experience with autonomous vehicles is a bonus.