Relace

Machine Learning Engineer

6 months ago
San Francisco, CA, USA
Mid Level / Senior

Responsibilities

  • Optimize models for speed and efficiency through low-level engineering.
  • Collaborate with research teams to productionize new architectures.
  • Enhance performance of training and inference workloads.
  • Work on CUDA kernels and GPU scheduling for improved performance.
  • Manage memory layouts and parallelization for large-scale ML systems.

Requirements

  • Strong background in systems-level ML engineering.
  • Experience with CUDA and GPU kernel optimization.
  • Fluency in Python and at least one systems language (C++ or Rust preferred).
  • Familiarity with distributed training frameworks like PyTorch or JAX.
  • Experience with large-scale training or inference infrastructure.
  • Understanding of memory management and hardware-aware model optimization.
  • 2+ years of experience in ML infrastructure or performance-critical environments.
  • Willingness to work in-person from the SF office in FiDi.

Tech Stack

C++, Python, PyTorch, Rust

Categories

AI & ML, Data Engineering