Cantina

Machine Learning Engineer, Core Data

9 days ago
Remote, Worldwide +2 more
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Define specifications and audit large-scale audio/text datasets.
  • Build automated quality metrics and validation dashboards.
  • Train models to tag, score, and filter data effectively.
  • Apply data cleaning techniques to maintain dataset integrity.
  • Optimize data selection through sampling and active learning.
  • Integrate quality gates into training and evaluation pipelines.

Requirements

  • Strong experience building ML-driven data quality systems for audio/speech.
  • Proficiency in Python and PyTorch, with experience training SSL-ASR models.
  • Familiarity with audio/speech fundamentals and relevant libraries.
  • Data engineering skills at scale, with tools like Spark and SQL.
  • Experience with ASR/TTS metrics and dataset validation.
  • Ability to translate ambiguous requirements into measurable improvements.

Tech Stack

Apache Airflow, Apache Beam, Apache Spark, AWS, Google Cloud Platform, Python, PyTorch, SQL

Categories

AI & ML, Data Engineering, Data Science