
Machine Learning, Platform Engineer
Together AI
4 days ago
Base Salary
$160k - $250k/yr
Responsibilities
- Work on multi-cluster orchestration, portfolio optimization, and predictive autoscaling.
- Analyze and improve the robustness and scalability of distributed systems and infrastructure.
- Collaborate with product teams to understand functional requirements.
- Write clear, well-tested, and maintainable software and IaC.
- Conduct design and code reviews and develop testing strategies.
Requirements
- 5+ years of experience building large-scale, fault-tolerant distributed systems.
- Experience with serverless inference platforms and cloud providers is a plus.
- Ability to discuss failures of, and improvements to, systems you have built.
- Experience in designing and improving system efficiency and stability.
- Excellent understanding of operating system concepts, including concurrency and networking.
- Expert-level programming skills in Python, Golang, Rust, C++, or Haskell.
- Proficiency in Infrastructure as Code (IaC) using tools like Terraform.
- Experience with Kubernetes or other container orchestration systems.
- Sound judgement on the use of LLMs for code.
- Bachelor’s or Master’s degree in a related technical field or equivalent experience.
Benefits
- Competitive compensation and startup equity.
- Health insurance and other competitive benefits.