San Francisco, CA, USA or New York, NY, USA
Mid Level / Senior
Base Salary
$180k - $360k/yr
Responsibilities
- Design, build, and operate the Model APIs surface with advanced inference capabilities.
- Profile and optimize TensorRT-LLM kernels and analyze CUDA kernel performance.
- Productionize performance improvements across runtimes with a deep understanding of their internals.
- Build comprehensive benchmarking frameworks to measure real-world performance.
- Instrument deep observability and build repeatable benchmarks for speed and reliability.
- Implement platform fundamentals such as API versioning and authentication.
- Collaborate closely with other teams to deliver a developer-friendly model serving experience.
Requirements
- 3+ years of experience building and operating distributed systems or large-scale APIs.
- Proven track record of owning low-latency, reliable backend services.
- Strong infrastructure instincts and performance sensibilities, including profiling and capacity planning.
- Comfortable debugging complex systems from runtime internals to GPU execution traces.
- Strong written communication skills for producing clear design docs.
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO policy including a company-wide Winter Break.
- Paid parental leave and fertility/family-building stipend.
- Company-facilitated 401(k) and exposure to various ML startups.
