Everbridge

Staff Platform Site Reliability Specialist (Observability & Kubernetes)

about 21 hours ago
Montréal, Canada
Staff+
H1B Sponsor

Responsibilities

  • Own the design, operation, and evolution of the observability stack.
  • Build and maintain a highly available, scalable observability platform.
  • Standardize instrumentation, dashboards, alerts, and SLOs.
  • Support incident response, root cause analysis, and capacity planning.
  • Operate and scale Grafana and its associated technologies.
  • Maintain reliability and security of EKS clusters.
  • Manage cluster lifecycle and upgrades.
  • Use Terraform for infrastructure provisioning.

Requirements

  • 6+ years of experience in SRE or Platform Engineering.
  • Strong experience with the Grafana ecosystem.
  • Expertise in Kubernetes and Amazon EKS.
  • Proficiency in Terraform.

Benefits

  • Comprehensive healthcare and dental care.
  • Mental health benefits.
  • Disability income benefits.
  • Life and AD&D insurance.
  • Retirement savings plan with employer match.
  • Paid time off.

Tech Stack

  • AWS
  • GitLab CI/CD
  • Google Cloud Platform
  • Grafana
  • Kubernetes
  • Terraform

Categories

  • DevOps
  • Security