Software Engineer - Gen AI Inference
Databricks
5 months ago
San Francisco, CA, USA
Mid Level / Senior
H1B Sponsor
Base Salary
$142k - $205k/yr
Responsibilities
- Contribute to the design and implementation of the inference engine for large-scale LLMs.
- Collaborate with researchers to integrate new model architectures and features.
- Optimize latency, throughput, memory efficiency, and hardware utilization.
- Build and maintain profiling and tracing tools to identify bottlenecks.
- Develop scalable routing, batching, scheduling, and memory management mechanisms.
- Support reliability and fault tolerance in inference pipelines.
- Integrate with distributed inference infrastructure and manage load balancing.
- Document and share learnings to contribute to best practices.
Requirements
- BS/MS/PhD in Computer Science or a related field.
- 3+ years of experience in performance-critical systems.
- Strong understanding of ML inference internals.
- Hands-on experience with CUDA and GPU programming.
- Experience designing and operating distributed systems.
- Ability to uncover and solve performance bottlenecks.
- Experience building instrumentation and profiling tools for ML models.
- Ability to work closely with ML researchers.
- Ownership mindset and eagerness to tackle complex challenges.
- Bonus: published research or open-source contributions in ML systems.
Tech Stack
Apache Spark, Databricks, MLflow
Categories
AI & ML, Data Engineering