Staff Software Engineer - GenAI Performance and Kernel
Databricks
5 months ago
San Francisco, CA, USA
Staff+
H-1B Sponsor
Base Salary
$191k - $233k/yr
Responsibilities
- Lead the design and implementation of core compute kernels optimized for various hardware backends.
- Drive the performance roadmap for kernel-level improvements including vectorization and auto-tuning.
- Integrate kernel optimizations with higher-level ML systems.
- Build and maintain profiling and verification tools to detect performance regressions.
- Conduct performance investigations and root-cause analysis on inference bottlenecks.
- Establish coding patterns and frameworks for kernel modularization and maintainability.
- Influence system architecture decisions for effective kernel improvements.
- Mentor and guide engineers in performance best practices.
- Collaborate with teams to roll out optimizations into production and monitor their impact.
Requirements
- BS/MS/PhD in Computer Science or a related field.
- Deep experience writing and tuning compute kernels for ML workloads.
- Strong knowledge of GPU/accelerator architecture and memory hierarchy.
- Experience with advanced optimization techniques like tiling and vectorization.
- Familiarity with ML-specific kernel libraries and open kernels.
- Strong debugging and profiling skills using tools like Nsight and nvprof.
- Experience with numerical stability, mixed precision, and quantization.
- Experience integrating optimized kernels into real-world ML inference systems.
- Proven track record of shipping performance-critical software.
- Excellent communication and leadership skills.
Categories
AI & ML, Data Engineering