Wikimedia Foundation

Senior Site Reliability Engineer, Wikimedia Enterprise

Remote, Worldwide
Senior
H1B Sponsor

Base Salary

$117k - $181k/yr

Responsibilities

  • Define, track, and improve Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
  • Build and enhance observability systems for proactive detection and troubleshooting.
  • Drive reliability engineering practices, including capacity planning and load testing.
  • Improve developer experience by enabling self-service infrastructure.
  • Partner with engineering teams to embed reliability best practices early in development.
  • Design and optimize CI/CD and GitOps workflows for automated deployments.
  • Implement secure-by-default infrastructure and enforce best practices.
  • Continuously optimize infrastructure cost and efficiency using FinOps principles.
  • Establish and track operational metrics to drive continuous improvement.
  • Reduce operational toil by implementing automation-first solutions.
  • Contribute to and evolve internal platform capabilities for scalability.
  • Collaborate with a globally distributed team.
  • Mentor peers in technical and operational areas.

Requirements

  • Experience with Infrastructure as Code and automation tools like Terraform or Ansible.
  • Proficiency in at least one programming language such as Python or Go.
  • Experience designing and operating cloud-based systems on platforms like AWS, Azure, or GCP.
  • Familiarity with CI/CD pipelines and GitOps workflows.
  • Experience with incident response and leading postmortems.
  • Strong understanding of SRE best practices, including SLOs and observability.
  • Ability to work effectively in a distributed, cross-functional environment.
  • Familiarity with Wikimedia or other open-source projects is a plus.