GrepJob
GitLab

Senior Site Reliability Engineer, Tenant Services: Geo

about 3 hours ago
Remote, India
Senior

Responsibilities

  • Execute Dedicated Geo migrations and cutovers end-to-end.
  • Join the team's shift and weekend coverage rotation for Dedicated cutovers.
  • Operate and improve the Geo operational surface for Dedicated.
  • Design, build, and maintain automation, tooling, and runbooks.
  • Run infrastructure with tools such as Ansible, Chef, Terraform, and Kubernetes.
  • Build and maintain monitoring, alerting, and dashboards.
  • Collaborate closely with the core Geo team and other infrastructure teams.
  • Contribute to readiness reviews, incident reviews, and root cause analyses.
  • Document actions, architecture decisions, and post-incident reviews.
  • Proactively identify and reduce toil by automating operational work.

Requirements

  • Experience operating highly available distributed systems at scale.
  • Hands-on experience with at least one major cloud provider.
  • Experience with Kubernetes and its ecosystem.
  • Experience with infrastructure as code and configuration management tools.
  • Strong programming skills in at least one general-purpose language.
  • Experience with observability systems and troubleshooting performance issues.
  • Practical exposure to data replication, backup/restore, or migration scenarios.
  • Comfort participating in an on-call rotation and investigating incidents.
  • Ability to engage directly with enterprise customers during migrations.
  • Strong written and verbal communication skills.

Benefits

  • Benefits to support your health, finances, and well-being.
  • Flexible Paid Time Off.
  • Team Member Resource Groups.
  • Equity Compensation & Employee Stock Purchase Plan.
  • Growth and Development Fund.
  • Parental leave.
  • Home office support.

Tech Stack

Ansible, AWS, Chef, GitLab CI/CD, Go, Google Cloud Platform, Grafana, Helm, Kubernetes, PostgreSQL, Prometheus, Python, Ruby, Terraform

Categories

DevOps