
Machine Learning Engineer — Multilingual Data
Featherless AI
Posted 4 months ago
Remote, Worldwide
Mid Level / Senior
Responsibilities
- Design and maintain large-scale multilingual datasets.
- Develop data pipelines for collection, cleaning, and labeling.
- Implement quality filters using various methods.
- Collaborate with researchers to define language coverage and evaluation metrics.
- Analyze dataset bias and coverage gaps across languages.
- Support training and fine-tuning workflows with multilingual data.
- Iterate on datasets based on model performance.
Requirements
- 3+ years of experience as an ML Engineer or in a similar role.
- Strong experience with multilingual datasets.
- Solid understanding of NLP fundamentals.
- Experience building scalable data pipelines using Python or similar tools.
- Familiarity with Unicode and language-specific challenges.
- Comfort collaborating with researchers.
Benefits
- Real ownership over a core differentiator of the product.
- Work on models used globally, not just in English-speaking markets.
- Small, high-caliber team with deep ML and systems experience.
- Competitive compensation and meaningful equity at the Series A stage.