Datadog

AI Research Engineer – Datadog AI Research (DAIR)

6 months ago
New York, NY, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$140k - $400k/yr

Responsibilities

  • Build and operate datasets, training and evaluation pipelines, benchmarks, and internal tooling.
  • Implement models, run experiments at scale, and profile for reliability, performance, and cost.
  • Orchestrate distributed training and distributed reinforcement learning with Ray.
  • Make the research stack observable, reproducible, and easier to use.
  • Establish rigorous automated benchmarks and regression tests for various AI tasks.
  • Collaborate with Research Scientists, Product, and Engineering teams.
  • Contribute high-quality code, documentation, and open-source artifacts.

Requirements

  • Strong software engineering skills with experience in observability, SRE, or security.
  • Depth in distributed computing and ML systems for training and inference at scale.
  • Proficient in Python and familiar with a systems language like Rust, C++, or Go.
  • Practical experience implementing and operating ML training and inference systems.
  • Familiar with efficient training, fine-tuning, and inference techniques for large models.
  • Ability to explain design and performance trade-offs to technical and non-technical audiences.
  • Strong interest in open science and open-source contributions.

Benefits

  • Competitive global benefits.
  • New hire stock equity (RSUs) and employee stock purchase plan (ESPP).
  • Opportunity to collaborate closely with colleagues across Datadog offices.
  • Opportunity to attend and present at conferences and meetups.
  • Intra-departmental mentor and buddy program for networking.
  • An inclusive company culture with employee resource groups.

Tech Stack

C++, Go, Python, PyTorch, Rust

Categories

AI & ML, Data Science, DevOps