about 4 hours ago
Remote, United Kingdom
Senior / Staff+
H1B Sponsor
Responsibilities
- Design and build automated reliability and self-healing systems for production.
- Own and improve incident management tooling and reduce alert noise.
- Develop observability infrastructure for real-time visibility into system health.
- Contribute to AI-driven operational tooling for autonomous remediation.
- Drive incident prevention by identifying systemic patterns and reducing operational toil.
- Partner with product engineering teams to enhance their operational practices.
- Define and champion operational excellence best practices across engineering.
- Embed Samsara’s cultural principles within the team.
Requirements
- 8+ years of experience in software engineering.
- Bachelor's Degree in Computer Science/Engineering or equivalent experience.
- 3+ years of experience in infrastructure or platform engineering.
- Expertise in observability, operational metrics, and data analysis.
- Proven track record in architecting monitoring frameworks and automated response workflows.
- Experience with large-scale enterprise software applications.
- Familiarity with cloud platforms like AWS or GCP.
- Experience implementing AI-driven automation in the software development lifecycle.
- Proficient in writing high-quality code in languages like Go or Python.
- Experience mentoring engineers and modeling sound engineering practices.
Benefits
- Flexible, employee-led remote work model.
- Professional development stipend.
- Comprehensive health and parental leave plans.
- Above-market total compensation including base salary, bonuses, and equity.
Tech Stack
AWS, Datadog, Go, Google Cloud Platform, Grafana, Python, Terraform
Categories
AI & ML, DevOps