Embedding VC

Member of Technical Staff - Efficient ML

4 months ago
San Francisco, CA, USA
Mid Level / Senior

Responsibilities

  • Improve training efficiency through optimized dataloaders and gradient checkpointing.
  • Optimize GPU performance using Nsight profiling and CUDA kernels.
  • Implement low-latency serving and continuous batching for inference.
  • Manage multi-node jobs with SLURM and Kubernetes.
  • Ensure reliability and determinism in the machine learning infrastructure.

Requirements

  • Experience with machine learning frameworks and optimization techniques.
  • Proficiency in GPU programming and performance profiling tools.
  • Familiarity with SLURM and Kubernetes for job management.
  • Knowledge of quantization, distillation, and pruning methods.
  • Strong problem-solving skills and ability to work in a team environment.

Tech Stack

Kubernetes

Categories

AI & ML
Data Engineering