Zeta

Lead Site Reliability Engineer

over 2 years ago
Hyderābād, India
Mid Level / Senior / Staff+
H1B Sponsor

Responsibilities

  • Establish a Site Reliability Engineering (SRE) site and build an effective team.
  • Provide technical leadership and collaborate with partner team leads and cloud leadership.
  • Guide team members on managing availability and performance of critical services.
  • Manage project priorities, deadlines, and deliverables.
  • Lead incident management during service incidents.
  • Drive Mean Time to Recovery (MTTR) to meet incident Service Level Agreements (SLAs).
  • Ensure 100% alert coverage across applications, infrastructure, and security.

Requirements

  • 6-10 years of experience in distributed systems, storage systems, or databases.
  • Strong knowledge of algorithms and data structures.
  • Experience with Unix/Linux systems internals and administration.
  • Proficient in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Hands-on experience with MySQL or PostgreSQL databases.
  • Experience operating Kubernetes and cloud environments.
  • Excellent communication skills and a systematic problem-solving approach.

Tech Stack

Kubernetes, Linux, MySQL, PostgreSQL

Categories

Backend, DevOps