Senior Software Engineer, Reliability

about 5 hours ago

Dublin, Ireland

Senior

H1B Sponsor

Responsibilities

Build and operate foundational, security-critical services with a focus on availability and fault tolerance.
Automate infrastructure to reduce operational toil and improve system reliability.
Design and implement systems using SRE best practices.
Define and refine SLIs, SLOs, and error budgets.
Enhance observability, alerting, and incident response.
Participate in on-call rotations with a focus on sustainable operations.
Conduct quantitative analysis to understand system behavior and capacity constraints.
Identify systemic risks and drive long-term solutions.
Collaborate with product, platform, and security engineers.
Mentor and pair with other engineers to improve operational maturity.

Qualifications

Proficiency in writing production-quality code (e.g., Python, Go).
Experience with distributed, cloud-native systems and an understanding of their failure modes.
Familiarity with containerized workloads and platforms (e.g., Kubernetes).
Comfort with on-call rotations and diagnosing production issues.
Experience designing and operating observability systems.
Knowledge of SRE concepts such as SLIs, SLOs, and error budgets.
Hands-on experience with infrastructure as code (e.g., Terraform).
Experience with capacity planning and performance analysis.
Ability to contribute to post-incident reviews.
Interest in experimenting with AI tools and workflows.