
Machine Learning Performance Engineer
Jane Street
about 1 year ago
New York, NY, USA
Mid Level / Senior
H1B Sponsor
Responsibilities
- Optimize the performance of machine learning models during training and inference.
- Enhance large-scale training efficiency and low-latency inference in real-time systems.
- Take a whole-systems approach to improving performance across storage, networking, and GPUs.
- Debug and optimize the end-to-end performance of training runs.
- Ensure the platform's performance metrics are accurate and meaningful.
Requirements
- Experience in low-level systems programming and optimization.
- Understanding of modern machine learning techniques and toolsets.
- Low-level GPU knowledge including PTX, SASS, and Tensor Cores.
- Experience with debugging and optimization tools like CUDA-GDB and Nsight.
- Familiarity with libraries such as Triton, cuDNN, and cuBLAS.
- Background in networking technologies like InfiniBand and NVLink.
- Fluency in English.
Categories
AI & ML, Data Engineering