Senior Site Reliability Engineer / Kubernetes (Remote)

Posted about 13 hours ago

Rome, Italy +4 more

Senior

Responsibilities

Operate and maintain Linux-based infrastructure (Debian/Ubuntu).
Deploy, manage, and scale Kubernetes clusters across various environments.
Oversee full cluster lifecycle including upgrades and security hardening.
Implement automation for provisioning and operations using Ansible and GitOps.
Design and maintain networking architecture including VLANs and VPNs.
Build automated deployment workflows and maintain observability stacks.
Lead incident response and improve system availability.
Define and implement SLOs/SLIs at multiple infrastructure levels.
Establish and maintain on-call schedules for coverage across timezones.
Develop Standard Operating Procedures (SOPs) for operations and maintenance tasks.

Ansible, Bash, Cloudflare, Grafana, Graylog, Istio, Kubernetes, Linux, OpenStack, Prometheus, Python