Berlin, Germany
Senior
Responsibilities
- Build out a self-service runtime platform for engineering teams.
- Integrate software development practices into platform engineering.
- Lead the overhaul of CI/CD pipelines in collaboration with product teams.
- Ensure site reliability through observability and disaster recovery solutions.
- Define and operate reliability standards through SLOs and error budgets.
- Drive infrastructure cost optimization across various technologies.
- Improve security posture through tooling and compliance work.
- Collaborate with engineering teams on platform architecture.
- Enhance developer productivity with platform services and tooling.
- Serve as secondary on-call for incident response.
Requirements
- 5+ years in backend or infrastructure engineering, with 2 years in SRE or platform engineering.
- Hands-on experience with GCP/AWS, Kubernetes, Terraform, and Helm in production.
- Strong software development background in building frameworks and internal tooling.
- Experience running observability platforms such as Datadog at scale.
- Proficient in defining and operating SLOs and error budgets.
- Solid understanding of Infrastructure as Code (IaC) and GitOps.
- Proven track record in designing and troubleshooting complex distributed systems.
Tech Stack
AWS, Datadog, Google Cloud Platform, Helm, Kubernetes, MongoDB, Terraform, TypeScript
Categories
Backend, DevOps, Security