5 months ago
Remote, Worldwide +2 more
Mid Level / Senior
H1B Sponsor
Responsibilities
- Act as the technical owner for enterprise customer post-training engagements involving text workloads.
- Translate customer requirements into concrete post-training specifications and workflows.
- Design and execute data generation, filtering, and quality assessment processes for text corpora.
- Run supervised fine-tuning, instruction tuning, RLHF, DPO, and other preference alignment workflows.
- Design task-specific evaluations for text model performance and interpret results.
- Build reusable applied tooling and workflows that accelerate future customer engagements.
Requirements
- Hands-on experience with data generation and evaluation for LLM post-training.
- Experience training or fine-tuning models using SFT, instruction tuning, RLHF, DPO, or similar methods.
- Strong intuition for text data quality and evaluation design.
- Experience with text-specific post-training workflows such as chat model alignment and instruction tuning.
- Proficiency with open-source ML ecosystem tools such as Hugging Face and PyTorch.
Benefits
- Competitive base salary with equity in a unicorn-stage company.
- 100% coverage of medical, dental, and vision premiums for employees and dependents.
- 401(k) matching up to 4% of base pay.
- Unlimited PTO plus company-wide Refill Days throughout the year.
