Responsibilities
- Analyze ML models to identify and resolve performance bottlenecks.
- Integrate open-source (OSS) tools that enable ML engineers to profile and optimize models independently.
- Deliver solutions to streamline model deployment across various hardware platforms.
- Collaborate with ML researchers to balance model accuracy and speed.
- Implement optimizations using CUDA, Triton, and custom kernels.
- Promote engineering excellence within the team.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 5+ years of software engineering experience, with proven expertise in GPU programming and performance optimization.
- Strong programming skills in C++ and Python.
- Familiarity with deep learning frameworks, especially PyTorch.
- Experience with CUDA programming and Triton language for GPU kernels.
- Knowledge of PyTorch optimization techniques and experience deploying models with TensorRT.
- Experience with ONNX model conversion and deployment.
- Deep understanding of GPU architectures and performance optimization.
- Strong analytical and problem-solving skills.
- Excellent verbal and written communication skills.
- Experience with autonomous vehicles (AV) is a bonus.