Together AI

Machine Learning, Platform Engineer

4 days ago
San Francisco, CA, USA
Senior / Staff+
H1B Sponsor

Base Salary

$160k - $250k/yr

Responsibilities

  • Work on multi-cluster orchestration, portfolio optimization, and predictive autoscaling.
  • Analyze and improve the robustness and scalability of distributed systems and infrastructure.
  • Collaborate with product teams to understand functional requirements.
  • Write clear, well-tested, and maintainable software and IaC.
  • Conduct design and code reviews and develop testing strategies.

Requirements

  • 5+ years of experience building large-scale, fault-tolerant distributed systems.
  • Experience with serverless inference platforms and cloud providers is a plus.
  • Ability to discuss failures in systems you have built and the improvements you made to them.
  • Experience designing systems and improving their efficiency and stability.
  • Excellent understanding of operating system concepts, including concurrency and networking.
  • Expert-level programming skills in Python, Golang, Rust, C++, or Haskell.
  • Proficiency in Infrastructure as Code (IaC) using tools like Terraform.
  • Experience with Kubernetes or other container orchestration systems.
  • Sound judgement on the use of LLMs for code.
  • Bachelor’s or Master’s degree in a related technical field or equivalent experience.

Benefits

  • Competitive compensation and startup equity.
  • Health insurance and other competitive benefits.

Tech Stack

C++, Go, Haskell, Kubernetes, Python, PyTorch, Rust, Terraform

Categories

AI & ML, Data Engineering, DevOps