Cerebras Systems

Performance Reliability Engineer

3 months ago
Sunnyvale, CA, USA or Toronto, Canada
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Characterize and enhance the performance and reliability of advanced ML hardware/software systems.
  • Analyze ML workloads, software kernels, and hardware architecture for power and performance impacts.
  • Develop creative software solutions to improve reliability and performance.
  • Influence the design of Cerebras' next-generation AI architecture and software stack.
  • Partner with ML engineers, researchers, and reliability specialists to understand model behavior.
  • Collaborate with teams in architecture, silicon, and research to advance computational platforms.

Requirements

  • BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field.
  • 3+ years of relevant experience in performance engineering, reliability, computer architecture, and/or software design.
  • Proficiency in Python or other scripting languages.
  • Experience with C/C++ and assembly programming.
  • Demonstrated expertise with system-level performance and reliability optimization.
  • Strong verbal and written communication skills.
  • Nice to have: Hands-on experience with ML models and frameworks.
  • Nice to have: Understanding of thermal management principles and power delivery for advanced semiconductors.

Benefits

  • Opportunity to build a breakthrough AI platform beyond the constraints of the GPU.
  • Ability to publish and open source cutting-edge AI research.
  • Work on one of the fastest AI supercomputers in the world.
  • Enjoy job stability with startup vitality.
  • Experience a simple, non-corporate work culture that respects individual beliefs.

Tech Stack

C, C++, Python

Categories

AI & ML, Data Engineering