Graphcore

AI Performance Engineer

5 days ago
Milpitas, CA, USA
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Analyze ML models’ compute and memory requirements using roofline analysis and simulations.
  • Collaborate across hardware and software teams to optimize large-scale AI workloads.
  • Benchmark, monitor, and troubleshoot system performance across distributed systems.
  • Optimize communication stacks including MPI, NCCL, UCX, RDMA, and networking fabrics.
  • Profile and optimize AI workloads, focusing on performance bottlenecks.
  • Write high-quality, ARM-compatible code and clear documentation.

Requirements

  • BS/MS in Computer Science, Electrical Engineering, or related field.
  • Experience with distributed systems and communication libraries (MPI, NCCL, UCX, libfabric).
  • Strong programming skills in C++ and Python.
  • Experience profiling and optimizing HPC or AI/ML workloads.
  • Familiarity with ML benchmarks such as MLPerf.
  • Desirable: Experience with GPUs or accelerated computing architectures.
  • Desirable: Knowledge of HPC networking and interconnect technologies (InfiniBand, RoCE).
  • Desirable: Familiarity with ML frameworks such as PyTorch or TensorFlow.
  • Desirable: Understanding of ARM architectures and toolchains.
  • Desirable: Strong debugging, profiling, and performance optimization skills.

Tech Stack

C++, Python, PyTorch, TensorFlow

Categories

AI & ML, Data Engineering, DevOps