about 2 years ago
San Francisco, CA, USA or New York, NY, USA
Entry Level / Mid Level
Base Salary
$180k - $360k/yr
Responsibilities
- Implement and refine techniques for ML model inference and infrastructure.
- Debug ML performance issues in underlying codebases like TensorRT and PyTorch.
- Apply optimization techniques across various ML models, especially large language models.
- Collaborate with a diverse team to design and implement innovative solutions.
- Own projects from idea to production.
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, Mathematics, or a related field.
- Experience with programming languages such as Python or C++.
- Familiarity with LLM optimization techniques like quantization and speculative decoding.
- Strong familiarity with ML libraries, especially PyTorch and TensorRT.
- Demonstrated interest and experience in large language models.
- Deep understanding of GPU architecture.
- Bonus: Proficiency in enhancing software performance for LLMs and experience with CUDA.
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and their dependents.
- Flexible PTO policy including company-wide Winter Break.
- Paid parental leave.
- Fertility and family-building stipend through Carrot.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups for learning and networking opportunities.
