Featherless AI

Machine Learning Engineer — Multilingual Data

4 months ago
Remote, Worldwide
Mid Level / Senior

Responsibilities

  • Design and maintain large-scale multilingual datasets.
  • Develop data pipelines for collection, cleaning, and labeling.
  • Implement quality filters using various methods.
  • Collaborate with researchers to define language coverage and evaluation metrics.
  • Analyze dataset bias and coverage gaps across languages.
  • Support training and fine-tuning workflows with multilingual data.
  • Iterate on datasets based on model performance.

Requirements

  • 3+ years of experience as an ML Engineer or in a similar role.
  • Strong experience with multilingual datasets.
  • Solid understanding of NLP fundamentals.
  • Experience building scalable data pipelines using Python or similar tools.
  • Familiarity with Unicode and language-specific challenges.
  • Comfort collaborating with researchers.

Benefits

  • Real ownership over a core differentiator of the product.
  • Work on models used globally, not just in English-speaking markets.
  • Small, high-caliber team with deep ML and systems experience.
  • Competitive compensation and meaningful equity at the Series A stage.

Tech Stack

Apache Spark, Python

Categories

AI & ML, Data Engineering, Data Science