Databricks

Staff Software Engineer - GenAI inference

5 months ago
San Francisco, CA, USA
Staff+
H1B Sponsor

Base Salary

$191k - $233k/yr

Responsibilities

  • Own and drive the architecture, design, and implementation of the inference engine.
  • Collaborate with researchers to integrate new model architectures and features.
  • Lead optimization efforts for latency, throughput, memory efficiency, and hardware utilization.
  • Define standards for instrumentation, profiling, and tracing tooling.
  • Architect scalable routing, batching, scheduling, and memory management mechanisms.
  • Ensure reliability and fault tolerance in inference pipelines.
  • Collaborate on integrating with distributed inference infrastructure.
  • Drive cross-team collaboration with platform engineers and security teams.
  • Represent the team through benchmarks, whitepapers, and open-source contributions.

Requirements

  • BS/MS/PhD in Computer Science or a related field.
  • 6+ years of experience in performance-critical systems.
  • Proven track record of owning complex system components.
  • Deep understanding of ML inference internals.
  • Hands-on experience with CUDA and GPU programming.
  • Strong background in distributed systems design.
  • Ability to uncover and solve performance bottlenecks.
  • Experience building instrumentation and profiling tools for ML models.
  • Excellent communication and leadership skills.
  • Bonus: published research or open-source contributions in ML systems.

Tech Stack

Apache Spark, Databricks, MLflow

Categories

AI & ML, Backend, Data Engineering