Palo Alto, CA, USA
Senior
Base Salary
$185k - $300k/yr
Responsibilities
- Lead and implement advanced inference acceleration techniques.
- Engineer and optimize GPU strategies for efficiency and scalability.
- Develop and optimize high-performance computing kernels using CUDA and NCCL.
- Collaborate with teams to deploy video generation and language models.
- Contribute to improvements in model training speed and resource utilization.
- Drive code reviews and mentor engineers on best practices.
Requirements
- 5+ years of engineering experience in inference acceleration and model deployment.
- Proven expertise in inference optimization techniques.
- Deep knowledge of GPU programming and parallelism strategies.
- Familiarity with video generation models and large language models.
- Strong cross-discipline communication skills.
- Self-driven with an ownership mindset.
Benefits
- Salary competitive within the AI industry.
- Equity in a fast-growing startup.
- Comprehensive health benefits and monthly stipends.
- Company retreats and a collaborative office culture.
