Nebius

ML/AI Engineer

3 months ago
Amsterdam, Netherlands
Mid Level / Senior

Responsibilities

  • Profile and analyze GPU performance at the system and kernel level.
  • Evaluate and compare GPU performance across different platforms and software stacks.
  • Debug and optimize ML workloads for efficient GPU execution.
  • Conduct acceptance testing for new GPU clusters to ensure performance and compatibility.
  • Perform experiments on diverse GPU configurations to assess performance impacts.
  • Develop tools and dashboards to visualize performance metrics and trends.
  • Contribute to internal tooling, frameworks, and best practices.

Requirements

  • Profound understanding of the theoretical foundations of machine learning.
  • Deep understanding of the performance aspects of large neural networks.
  • Experience with modern deep learning frameworks like PyTorch and JAX.
  • Good understanding of the GPU stack including CUDA and relevant libraries.
  • Familiarity with containerized environments such as Docker and Kubernetes.
  • Strong communication skills and ability to work independently.

Benefits

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within Nebius.
  • Flexible working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.

Tech Stack

  • AWS
  • Docker
  • Google Cloud Platform
  • Kubernetes
  • PyTorch

Categories

  • AI & ML
  • Data Science