AI Research Engineer – Datadog AI Research (DAIR)
Datadog
6 months ago
New York, NY, USA
Mid Level / Senior
H-1B Sponsor
Base Salary
$140k - $400k/yr
Responsibilities
- Build and operate datasets, training and evaluation pipelines, benchmarks, and internal tooling.
- Implement models, run experiments at scale, and profile for reliability, performance, and cost.
- Orchestrate distributed training and distributed reinforcement learning with Ray.
- Make the research stack observable, reproducible, and easier to use.
- Establish rigorous automated benchmarks and regression tests for various AI tasks.
- Collaborate with Research Scientists, Product, and Engineering teams.
- Contribute high-quality code, documentation, and open-source artifacts.
Requirements
- Strong software engineering skills with experience in observability, SRE, or security.
- Depth in distributed computing and ML systems for training and inference at scale.
- Proficient in Python and familiar with a systems language like Rust, C++, or Go.
- Practical experience implementing and operating ML training and inference systems.
- Familiar with efficient training, fine-tuning, and inference techniques for large models.
- Ability to explain design and performance trade-offs to technical and non-technical audiences.
- Strong interest in open-science and open-source contributions.
Benefits
- Competitive global benefits.
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP).
- Opportunity to collaborate closely with colleagues across Datadog offices.
- Opportunity to attend and present at conferences and meetups.
- Intra-departmental mentor and buddy program for networking.
- An inclusive company culture with employee resource groups.
Tech Stack
C++, Go, Python, PyTorch, Rust
Categories
AI & ML, Data Science, DevOps