Staff Software Engineer, Observability
Cerebras Systems
17 days ago
Sunnyvale, CA, USA or Toronto, Canada
Staff+
H-1B Sponsor
Responsibilities
- Own the long-term architecture and roadmap for the observability platform.
- Design telemetry pipelines for high-cardinality, high-frequency data.
- Partner with teams to define SLOs and build alerting strategies.
- Integrate critical hardware health signals into a unified observability layer.
- Implement instrumentation libraries and standards for observability.
- Mentor senior engineers and influence engineering practices.
Requirements
- 8+ years of software engineering experience, with 4+ years in observability platforms.
- Deep expertise in the open-source observability ecosystem.
- Experience with OpenTelemetry for instrumentation in diverse environments.
- Proficiency in Go preferred, with strong experience in Python.
- Strong distributed systems and Kubernetes expertise.
- Experience with observability cost management and capacity planning.
- Proven track record of setting technical direction across teams.
Benefits
- Opportunity to build a breakthrough AI platform beyond GPU constraints.
- Ability to publish and open source cutting-edge AI research.
- Work on one of the fastest AI supercomputers in the world.
- Enjoy job stability with startup vitality.
- Experience a simple, non-corporate work culture.
Tech Stack
ClickHouse, Elasticsearch, Go, Kubernetes, Prometheus, Python
Categories
AI & ML, Backend, Data Engineering