Software Engineer, Kernel Reliability
Cerebras Systems
about 3 hours ago
Sunnyvale, CA, USA or Toronto, Canada
Entry Level / Mid Level
H-1B Sponsor
Responsibilities
- Contribute to the technical roadmap for kernel-centric reliability.
- Partner with System and Cluster Operations to reduce downtime.
- Enhance debug tools to speed up failure analysis.
- Collaborate with software teams to improve the software stack.
- Co-design next-generation architectures with hardware teams.
- Participate in incident response and root-cause analysis.
Requirements
- Strong programming skills in C/C++ and Python.
- Solid foundations in operating systems and computer architecture.
- Ability to debug complex issues using logs and traces.
- Interest in root-cause analysis.
Benefits
- Opportunity to build a breakthrough AI platform.
- Ability to publish and open-source cutting-edge AI research.
- Work on one of the fastest AI supercomputers in the world.
- Enjoy job stability with startup vitality.
- Experience a non-corporate work culture that respects individual beliefs.
Tech Stack
C, C++, Python
Categories
AI & ML, Embedded