AI Research Engineer - Datadog AI Research (DAIR)
Datadog
6 months ago
Paris, France
Mid Level / Senior
H1B Sponsor
Responsibilities
- Build and operate datasets, training and evaluation pipelines, benchmarks, and internal tooling.
- Implement models, run experiments at scale, and profile for reliability, performance, and cost.
- Orchestrate distributed training and reinforcement learning with Ray, including scheduling, scaling, and failure recovery.
- Make the research stack observable, reproducible, and easier to use.
- Establish rigorous automated benchmarks and regression tests for various AI tasks.
- Collaborate with Research Scientists, Product, and Engineering to integrate AI capabilities into Datadog’s products.
- Contribute high-quality code, documentation, and open-source artifacts.
Requirements
- Strong software engineering skills with experience in observability, SRE, or security.
- Depth in distributed computing and ML systems for training and inference at scale.
- Proficient in Python and familiar with a systems language like Rust, C++, or Go.
- Practical experience implementing and operating ML training and inference systems.
- Familiarity with efficient training, fine-tuning, and inference techniques for large foundation models.
- Ability to explain design and performance trade-offs to technical and non-technical audiences.
- Strong interest in open-science and open-source contributions.
Benefits
- Competitive global benefits.
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP).
- Opportunity to collaborate closely with colleagues across Datadog offices in New York City and Paris.
- Opportunity to attend and present at conferences and meetups.
- Intra-departmental mentor and buddy program for networking.
- An inclusive company culture with employee resource groups.
Tech Stack
C++, Go, Python, PyTorch, Rust
Categories
AI & ML, Backend, Data Science