Coupa Software

Senior AI Engineer, NLP & Training Data - 11316

about 20 hours ago
Bengaluru, India
Senior
H1B Sponsor

Responsibilities

  • Design and implement pipelines for generating training data, including synthetic data.
  • Build data labeling and annotation workflows with quality validation loops.
  • Convert enterprise data into formats suitable for model training.
  • Implement active learning strategies to identify high-value training examples.
  • Collaborate with domain experts to validate training data quality and relevance.
  • Build automated data quality checks for coverage, balance, and consistency.
  • Design training data versioning and lineage tracking.
  • Analyze model evaluation results to identify training data gaps.

Requirements

  • 5+ years of software engineering experience, with 2+ years in NLP, data science, or ML data engineering.
  • Experience with text processing, tokenization, and NLP pipelines.
  • Hands-on experience with data labeling tools and annotation workflows.
  • Experience generating synthetic training data using language model APIs.
  • Understanding of instruction-tuning and training data quality metrics.
  • Proficiency in Python (pandas, PySpark).
  • Experience with data versioning tools is a plus.
  • BS/MS in Computer Science, NLP, or equivalent experience.

Tech Stack

Pandas, Python

Categories

AI & ML, Data Engineering, Data Science