Staff Software Engineer - GenAI Inference
Databricks
5 months ago
San Francisco, CA, USA
Staff+
H1B Sponsor
Base Salary
$191k - $233k/yr
Responsibilities
- Own and drive the architecture, design, and implementation of the inference engine.
- Collaborate with researchers to integrate new model architectures and features.
- Lead optimization efforts for latency, throughput, memory efficiency, and hardware utilization.
- Define standards for instrumentation, profiling, and tracing tooling.
- Architect scalable routing, batching, scheduling, and memory management mechanisms.
- Ensure reliability and fault tolerance in inference pipelines.
- Collaborate on integrating with distributed inference infrastructure.
- Drive cross-team collaboration with platform engineers and security teams.
- Represent the team through benchmarks, whitepapers, and open-source contributions.
Requirements
- BS/MS/PhD in Computer Science or a related field.
- 6+ years of experience in performance-critical systems.
- Proven track record of owning complex system components.
- Deep understanding of ML inference internals.
- Hands-on experience with CUDA and GPU programming.
- Strong background in distributed systems design.
- Ability to uncover and solve performance bottlenecks.
- Experience building instrumentation and profiling tools for ML models.
- Excellent communication and leadership skills.
- Bonus: published research or open-source contributions in ML systems.
Tech Stack
Apache Spark, Databricks, MLflow
Categories
AI & ML, Backend, Data Engineering