Software Engineer - Model Performance

over 2 years ago

San Francisco, CA, USA or New York, NY, USA

Entry Level / Mid Level

H-1B Sponsor

Base Salary

$180k - $360k/yr

Responsibilities

Implement and refine techniques for ML model inference and infrastructure.
Debug ML performance issues in underlying codebases like TensorRT and PyTorch.
Apply optimization techniques across various ML models, especially large language models.
Collaborate with a diverse team to design and implement innovative solutions.
Own projects from idea to production.

Qualifications

Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, Mathematics, or related field.
Experience with programming languages such as Python or C++.
Familiarity with LLM optimization techniques like quantization and speculative decoding.
Strong familiarity with ML libraries, especially PyTorch and TensorRT.
Demonstrated interest and experience in large language models.
Deep understanding of GPU architecture.
Bonus: Proficiency in enhancing software performance for LLMs and experience with CUDA.

Benefits

Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employee and dependents.
Flexible PTO policy including company-wide Winter Break.
Paid parental leave.
Fertility and family-building stipend through Carrot.
Company-facilitated 401(k).
Exposure to a variety of ML startups for learning and networking opportunities.