Full Stack LLM Engineer

Cerebras Systems

5 months ago

Bengaluru, India

Senior / Staff+

H1B Sponsor

Responsibilities

Contribute to the end-to-end bring-up of frameworks for RL, inference serving, and ML models on Cerebras CSX systems.
Work across the stack including model architecture translation, graph lowering, compiler optimizations, and runtime integration.
Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization.
Propose and prototype improvements to tools, APIs, and automation flows to accelerate future bring-ups.

Requirements

Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field, with 8 to 12 years of experience.
Comfort navigating the full AI toolchain including Python modeling code and performance profiling.
Strong debugging skills across performance, numerical accuracy, and runtime integration.
Experience with deep learning frameworks such as PyTorch and TensorFlow.
Proficiency in C/C++ programming and experience with low-level optimization.
Strong background in optimization techniques, particularly those involving NP-hard problems.

Benefits

Competitive salary and benefits package.
Opportunities for professional growth and career advancement.
A dynamic and innovative work environment.
The chance to work on cutting-edge technologies and make a significant impact on the future of AI.

C, C++, Python, PyTorch, TensorFlow