Reddit

Staff SRE, Ads

about 2 hours ago
Remote, Ireland
Staff+
H1B Sponsor

Responsibilities

  • Lead reliability initiatives across multiple Ads domains including ad serving and reporting.
  • Partner with engineering leadership to improve reliability and scalability.
  • Drive architecture reviews and influence technical decisions.
  • Design and build platforms and automation to enhance reliability.
  • Participate in on-call rotations and lead incident investigations.
  • Identify systemic reliability risks and implement long-term solutions.
  • Establish reliability metrics for critical user journeys.
  • Mentor engineers and provide technical leadership.

Requirements

  • 8+ years of experience in Site Reliability Engineering or related roles.
  • Strong experience in high-traffic, user-facing production environments.
  • Deep understanding of distributed systems and cloud-native architectures.
  • Experience designing highly available systems with strong operational practices.
  • Strong understanding of observability systems including metrics and alerting.
  • Good programming skills in languages such as Go or Python.
  • Experience improving reliability through SLOs and automation.
  • Demonstrated ability to troubleshoot complex issues in distributed systems.

Benefits

  • Global Benefit programs that fit your lifestyle.
  • Family Planning Support.
  • Gender-Affirming Care.
  • Mental Health & Coaching Benefits.
  • Private Medical, Dental, and Vision Benefits.
  • Personal Retirement Savings Account with matching contribution.
  • Cycle to Work and Tax Saver schemes.
  • Flexible Vacation & Paid Volunteer Time Off.
  • Generous Paid Parental Leave.

Tech Stack

Apache Flink, Apache Kafka, Apache Spark, ClickHouse, Go, Google BigQuery, Kubernetes, Python

Categories

AI & ML, Backend, Data Engineering, DevOps