9 months ago
Toronto, Canada +4 more
Mid Level / Senior
H1B Sponsor
Responsibilities
- Design and write high-performing and scalable software for training models.
- Develop new tools to support and accelerate research and LLM training.
- Coordinate with engineering and scientific teams to create an integrated post-training ecosystem.
- Implement techniques to improve performance and speed up training cycles.
- Research, implement, and experiment with ideas on cluster and data infrastructure.
- Collaborate with scientists and engineers across teams.
Requirements
- Extremely strong software engineering skills.
- Value test-driven development methods and clean code.
- Proficiency in Python and related ML frameworks such as JAX and PyTorch.
- Experience with large-scale distributed training strategies.
- Bonus: Experience with distributed training infrastructures like Kubernetes.
- Bonus: Hands-on experience with the post-training phase of model training.
- Bonus: Experience in ML, LLM, and RL academic research.
Benefits
- An open and inclusive culture and work environment.
- Work closely with a team on the cutting edge of AI research.
- Weekly lunch stipend, in-office lunches & snacks.
- Full health and dental benefits, including a budget for mental health.
- 100% Parental Leave top-up for up to 6 months.
- Personal enrichment benefits for arts, culture, fitness, and workspace improvement.
- Remote-flexible work options and co-working stipend.
- 6 weeks of vacation (30 working days).
