Baseten

Software Engineer - Model Performance

about 2 years ago
San Francisco, CA, USA or New York, NY, USA
Entry Level / Mid Level

Base Salary

$180k - $360k/yr

Responsibilities

  • Implement and refine techniques for ML model inference and infrastructure.
  • Debug ML performance issues in underlying codebases like TensorRT and PyTorch.
  • Apply optimization techniques across various ML models, especially large language models.
  • Collaborate with a diverse team to design and implement innovative solutions.
  • Own projects from idea to production.

Requirements

  • Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, Mathematics, or a related field.
  • Experience with programming languages such as Python or C++.
  • Familiarity with LLM optimization techniques like quantization and speculative decoding.
  • Strong familiarity with ML libraries, especially PyTorch and TensorRT.
  • Demonstrated interest and experience in large language models.
  • Deep understanding of GPU architecture.
  • Bonus: Proficiency in enhancing software performance for LLMs and experience with CUDA.

Benefits

  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents.
  • Flexible PTO policy, including a company-wide Winter Break.
  • Paid parental leave.
  • Fertility and family-building stipend through Carrot.
  • Company-facilitated 401(k).
  • Exposure to a variety of ML startups for learning and networking opportunities.

Tech Stack

C++, Docker, Kubernetes, Python, PyTorch

Categories

AI & ML, Backend, Data Engineering