OKX

Staff AI Engineer, Model Post-Training and Alignment

about 3 hours ago
Singapore, Singapore
Staff+

Responsibilities

  • Lead and execute the full post-training pipeline for large language models.
  • Design and implement advanced training paradigms such as DPO and GRPO.
  • Develop domain-specific data recipes and augmentation pipelines.
  • Train and post-train specialized small models from scratch.
  • Build and refine Reward Models to support alignment and optimization.
  • Design and implement RLAIF closed-loop systems.
  • Optimize inference efficiency and deploy models using low-latency frameworks.
  • Evaluate model performance using automated benchmarks and feedback loops.
  • Collaborate with research and infrastructure teams for production workflows.

Requirements

  • Bachelor's degree in Computer Science, AI, Machine Learning, or a related field, with at least 8 years of industry experience.
  • Strong hands-on experience across the full post-training pipeline for large models.
  • Deep familiarity with preference learning and alignment techniques.
  • Proven experience designing domain-specific data strategies and training methodologies.
  • Experience training and post-training specialized small models from scratch.
  • Solid understanding of reinforcement learning fundamentals.
  • Experience deploying models in low-latency production environments.

Benefits

  • Competitive total compensation package.
  • Learning and development (L&D) programs and education subsidies for employee growth.
  • Various team building programs and company events.
  • Wellness and meal allowances.
  • Comprehensive healthcare schemes for employees and dependants.

Categories

AI & ML, Data Science