Software Engineer, Site Reliability

about 2 months ago

Istanbul, Turkey

Senior

H1B Sponsor

Responsibilities

Own and operate Kubernetes infrastructure including lifecycle and upgrades.
Build and maintain CI/CD pipelines and deployment infrastructure.
Automate analysis and resolution of production issues using AI.
Create dashboards, alerting, and anomaly detection systems.
Define SLOs and develop incident response processes.
Manage networking, load balancing, and service mesh configurations.
Drive reliability improvements through automation and chaos engineering.

Requirements

5+ years of experience managing critical production systems.
Strong experience with Kubernetes at scale and infrastructure-as-code tools.
Deep knowledge of Linux and container networking.
Experience building CI/CD systems and GitOps workflows.
Proficiency in Python and either Go or Bash.
Strong experience with logging, monitoring, and alerting tools.
Excellent communication skills and ability to drive technical decisions.
Self-starter with a focus on ownership and continuous improvement.

Benefits

Interesting and challenging work.
Opportunities for learning and growth.
Regular team events and offsites.

Tech Stack

Ansible, Bash, Datadog, Go, Grafana, Kubernetes, Linux, Prometheus, Python, Terraform

Categories

AI & ML, DevOps, Security