Replit

Senior Site Reliability Engineer

about 5 hours ago
Remote, United Kingdom +5 more
Senior / Mid Level
H-1B Sponsor

Responsibilities

  • Design and implement observability solutions for system health and performance.
  • Drive automation and infrastructure as code using tools like Terraform and Ansible.
  • Establish service level objectives (SLOs) and service level indicators (SLIs).
  • Lead incident management efforts and conduct post-mortems.
  • Identify and resolve performance bottlenecks across the infrastructure.

Requirements

  • 4-8 years of experience in Site Reliability Engineering or similar roles.
  • Strong programming skills in languages like Python or Go.
  • Deep understanding of distributed systems.
  • Experience with container orchestration platforms like Kubernetes.
  • Proven track record of implementing monitoring and observability solutions.

Benefits

  • Competitive salary and equity.
  • 401(k) program with a 4% match (US only).
  • Health, dental, vision, and life insurance.
  • Flexible time off (FTO) plus holidays.
  • Monthly wellness stipend.

Tech Stack

Ansible, Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, Terraform

Categories