fal

Staff Software Engineer, ML Performance & Systems

6 days ago

Base Salary

$180k - $250k/yr

Responsibilities

  • Help fal maintain its frontier position on model performance for generative media models.
  • Design and implement novel approaches to model serving architecture on top of our in-house inference engine.
  • Develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities.
  • Work closely with our Applied ML team and customers to ensure their workloads benefit from our accelerator.

Requirements

  • Strong foundation in systems programming with expertise in identifying and fixing bottlenecks.
  • Deep understanding of the cutting-edge ML infrastructure stack, including model compilation, quantization, and serving architectures.
  • Fundamental understanding of the underlying hardware, particularly NVIDIA-based systems.
  • Proficiency in Triton, or willingness to learn it backed by comparable experience in lower-level accelerator programming.
  • Familiarity with multi-dimensional model parallelism techniques.
  • Knowledge of the internals of Ring Attention, FA3 (FlashAttention-3), and FusedMLP implementations.

Benefits

  • Interesting and challenging work.
  • Competitive salary and equity.
  • A lot of learning and growth opportunities.
  • Relocation assistance to San Francisco.
  • Health, dental, and vision insurance (US).
  • Regular team events and offsites.

Tech Stack

PyTorch

Categories

AI & ML
Data Engineering