10 months ago
San Francisco, CA, USA or New York, NY, USA
Mid Level / Senior
Base Salary
$180k - $360k/yr
Responsibilities
- Design and implement high-performance GPU kernels for ML operations.
- Write and optimize code using CUDA and architecture-specific techniques.
- Apply advanced performance optimization methods.
- Implement cutting-edge features like quantization and compute/communication overlap.
- Identify and resolve performance bottlenecks using profiling tools.
- Collaborate with research teams to productionize advancements.
- Contribute to internal and open-source GPU libraries.
- Present technical contributions at industry conferences.
Requirements
- Strong understanding of GPU architecture and programming paradigms.
- Proficient in C++ and GPU performance profiling tools.
- Knowledge of CUDA C++ API and memory access patterns.
- Familiarity with numerical precision and quantization strategies.
- Understanding of modern GPU features like tensor cores.
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO policy including a company-wide Winter Break.
- Paid parental leave.
- Fertility and family-building stipend through Carrot.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups for learning and networking.
Tech Stack
C++
Categories
AI & ML
Data Engineering
