GrepJob
Cohere

Member of Technical Staff, Training Performance Engineer

Cohere
Apply
about 1 year ago
Toronto, Canada +4 moreMid Level / Senior
H1B Sponsor

Responsibilities

  • Design and write high-performant and scalable software for training.
  • Understand architectural modifications and their effects on training throughput and quality.
  • Write low-level CUDA and triton kernels to optimize accelerator performance.
  • Research and implement ideas on supercompute and data infrastructure.
  • Collaborate with leading researchers in the field.

Requirements

  • Extremely strong software engineering skills.
  • Proficiency in Python and ML frameworks such as JAX, Pytorch, and XLA/MLIR.
  • Experience writing kernels for GPUs using CUDA and triton.
  • Experience with large-scale distributed training strategies.
  • Familiarity with autoregressive sequence models like Transformers.
  • Bonus: published papers at top-tier venues.

Benefits

  • An open and inclusive culture and work environment.
  • Weekly lunch stipend, in-office lunches, and snacks.
  • Full health and dental benefits, including mental health support.
  • 100% Parental Leave top-up for up to 6 months.
  • Personal enrichment benefits for arts, culture, fitness, and workspace improvement.
  • Remote-flexible work options and co-working stipend.
  • 6 weeks of vacation (30 working days).

Tech Stack

PythonPyTorch

Categories

AI & MLBackendData Engineering