25 days ago
Stockholm, Sweden
Mid Level / Senior
Responsibilities
- Build, ship, and operate foundational platform services with full ownership.
- Maintain a high-signal observability stack and translate signals into action.
- Define and evolve SLIs/SLOs, alerting, and reliability reporting for critical systems.
- Improve on-call and incident response processes, including escalation paths and follow-ups.
- Reduce toil through automation and improved system ergonomics.
- Collaborate with product and platform engineers to design resilient systems.
Requirements
- Significant experience operating and improving production systems.
- Ability to debug under pressure and prevent repeat incidents.
- Comfortable writing software and building automation for reliability.
- Autonomous, with pride in the quality and resilience of the systems you run.
- Strong observability, incident management, and on-call experience.
- Experience with cloud infrastructure and Kubernetes.
Tech Stack
Kubernetes
