Senior Site Reliability Engineer

3 months ago

Melbourne, Australia or London, United Kingdom

Mid Level / Senior

H1B Sponsor

Responsibilities

Participate in on-call and incident response, progressing to leading incidents over time.
Identify and drive fixes for recurring issues and reliability risks.
Operate and improve Kubernetes clusters and cloud infrastructure.
Enhance observability through improved dashboards and alerts.
Automate repetitive tasks and simplify operational processes.
Support safe change with improved deployment and rollback mechanisms.
Write and maintain runbooks and participate in post-mortems.
Collaborate with engineers to improve service reliability.

Requirements

3–6+ years in SRE, DevOps, or operations-heavy engineering roles.
Experience supporting production systems and on-call rotations.
Comfortable debugging live systems under pressure.
Experience operating cloud infrastructure, preferably AWS.
Working knowledge of Kubernetes and containerized workloads.
Experience with Infrastructure as Code tools like Terraform.
Familiarity with monitoring and alerting tools such as Datadog.
Scripting or automation experience in Python or Bash.

Benefits

Equity from day one, sharing in the company's success.
Personal development budget and wellness days.
Flexible hybrid work environment with 3 days in the office.
Opportunity to work alongside world-class talent.
Impactful role in shaping international expansion.

Tech Stack

AWS, Bash, Datadog, Kubernetes, Prometheus, Python, Terraform

Categories

DevOps, Security