
Machine Learning Performance Engineer
Jane Street
about 1 year ago
London, United Kingdom
Mid Level / Senior
H1B Sponsor
Responsibilities
- Optimize the performance of machine learning models during training and inference.
- Enhance large-scale training efficiency and low-latency inference in real-time systems.
- Take a whole-systems approach to improving performance across storage, networking, and GPUs.
- Debug and analyze training run performance from end to end.
- Ensure throughput translates into goodput, down to the lowest levels of the system.
Requirements
- Experience in low-level systems programming and optimization.
- Understanding of modern machine learning techniques and toolsets.
- Low-level GPU knowledge including PTX, SASS, and memory hierarchy.
- Experience with debugging and optimization tools like CUDA-GDB and Nsight.
- Familiarity with libraries such as Triton, cuDNN, and cuBLAS.
- Background in networking technologies like InfiniBand and NVLink for GPU clusters.
- Fluency in English.
Categories
AI & ML, Data Engineering