Cohere

Member of Technical Staff, Synthetic Data

5 months ago
Toronto, Canada +4 more
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Design and build scalable inference pipelines that run on large GPU clusters.
  • Conduct data ablations to assess data quality and experiment with data mixtures to enhance model performance.
  • Research and implement innovative synthetic data curation methods.
  • Collaborate with cross-functional teams to ensure data pipelines meet the demands of language models.

Requirements

  • Strong software engineering skills with proficiency in Python.
  • Experience building data pipelines and familiarity with data processing frameworks like Apache Spark or Pandas.
  • Experience working with large-scale datasets, including web and code data.
  • Familiarity with LLM inference frameworks such as vLLM and TensorRT.
  • A passion for bridging research and engineering in AI model training.

Benefits

  • An open and inclusive culture and work environment.
  • Weekly lunch stipend, in-office lunches, and snacks.
  • Full health and dental benefits, including a budget for mental health.
  • 100% Parental Leave top-up for up to 6 months.
  • Personal enrichment benefits for arts, culture, fitness, and workspace improvement.
  • Remote-flexible work options with offices in major cities.
  • 6 weeks of vacation (30 working days).

Tech Stack

Apache Beam, Apache Spark, Pandas, Python

Categories

AI & ML, Backend, Data Science