Staff Software Engineer - AI Research Infrastructure
Databricks
about 3 hours ago
New York, NY, USA
Staff+
H1B Sponsor
Base Salary
$190k - $270k/yr
Responsibilities
- Design and implement infrastructure for large-scale experiments and model training.
- Build abstractions for job submission, scheduling, and monitoring to expedite research processes.
- Create tooling to improve research developer productivity, including experiment management systems.
- Influence the long-term roadmap for research computation.
- Mentor and support other engineers in compute, infrastructure, and AI systems.
Requirements
- BS, MS, or PhD in Computer Science or a related field.
- 5+ years of software engineering experience with large-scale distributed systems.
- Deep experience in building and operating distributed systems and data pipelines.
- Proficiency in systems programming languages such as C++, Rust, Go, Java, or Scala.
- Experience with cluster schedulers or large-scale job orchestration systems.
- Understanding of modern ML training and inference workflows.
- Ability to communicate effectively with researchers and engineers.
Tech Stack
Apache Spark, C++, Databricks, Go, Java, Kubernetes, MLflow, Rust, Scala
Categories
AI & ML, Backend, Data Engineering