Forto

Senior Site Reliability Engineer

about 3 hours ago
Berlin, Germany
Senior

Responsibilities

  • Build out a self-service runtime platform for engineering teams.
  • Integrate software development practices into platform engineering.
  • Lead the overhaul of CI/CD pipelines in collaboration with product teams.
  • Ensure site reliability through observability and disaster recovery solutions.
  • Define and operate reliability standards through SLOs and error budgets.
  • Drive infrastructure cost optimization across various technologies.
  • Improve security posture through tooling and compliance work.
  • Collaborate with engineering teams on platform architecture.
  • Enhance developer productivity with platform services and tooling.
  • Serve as secondary on-call for incident response.

Requirements

  • 5+ years in backend or infrastructure engineering, with 2 years in SRE or platform engineering.
  • Hands-on experience with GCP/AWS, Kubernetes, Terraform, and Helm in production.
  • Strong software development background in building frameworks and internal tooling.
  • Experience with observability platforms like Datadog at scale.
  • Proficiency in defining and operating SLOs and error budgets.
  • Solid understanding of Infrastructure as Code (IaC) and GitOps.
  • Proven track record in designing and troubleshooting complex distributed systems.

Tech Stack

AWS, Datadog, Google Cloud Platform, Helm, Kubernetes, MongoDB, Terraform, TypeScript

Categories

Backend, DevOps, Security