Reddit

Staff Site Reliability Engineer - Site Experience

about 7 hours ago
Dublin, Ireland
Staff+
H1B Sponsor

Responsibilities

  • Lead reliability engineering for user experience by driving operational excellence for critical systems.
  • Architect scalable systems in collaboration with product and infrastructure teams.
  • Identify and mitigate systemic risks and reliability bottlenecks.
  • Automate operational tasks to improve deployment safety and incident response.
  • Manage complex incident responses and ensure sustainable fixes are implemented.
  • Define best practices for reliability engineering and operational maturity.
  • Mentor engineers and shape the reliability culture across the organization.

Requirements

  • 8+ years of experience in Site Reliability Engineering or related roles.
  • Strong collaboration and communication skills to influence technical direction.
  • Experience supporting high-traffic, user-facing production environments.
  • Deep understanding of distributed systems, networking, and cloud-native architectures.
  • Experience designing highly available systems with strong operational practices.
  • Strong programming skills in languages such as Go or Python.
  • Understanding of observability systems including metrics and alerting.
  • Ability to troubleshoot complex issues across applications and infrastructure.

Benefits

  • Global benefit programs that fit your lifestyle, including workspace and professional development support.
  • Family planning support and gender-affirming care.
  • Mental health and coaching benefits.
  • Private medical, dental, and vision benefits.
  • Personal retirement savings account with matching contributions.
  • Cycle to work and tax saver schemes.
  • Flexible vacation and paid volunteer time off.
  • Generous paid parental leave.

Tech Stack

Ambassador, Apache Cassandra, Apache Kafka, ClickHouse, Go, Grafana, Kubernetes, Prometheus, Python, Redis

Categories