Lead Site Reliability Engineer
Zeta
over 2 years ago
Hyderābād, India
Mid Level / Senior / Staff+
H1B Sponsor
Responsibilities
- Establish a Site Reliability Engineering (SRE) site and build an effective team.
- Provide technical leadership and collaborate with partner team leads and cloud leadership.
- Guide team members on managing availability and performance of critical services.
- Manage project priorities, deadlines, and deliverables.
- Lead incident management during service incidents.
- Keep Mean Time to Recovery (MTTR) within incident Service Level Agreement (SLA) targets.
- Ensure 100% alert coverage across applications, infrastructure, and security.
Requirements
- 6-10 years of experience in distributed systems, storage systems, or databases.
- Strong knowledge of algorithms and data structures.
- Experience with Unix/Linux systems internals and administration.
- Proficient in designing, analyzing, and troubleshooting large-scale distributed systems.
- Hands-on experience with MySQL or PostgreSQL databases.
- Experience operating Kubernetes and cloud environments.
- Excellent communication skills and a systematic problem-solving approach.
Tech Stack
Kubernetes, Linux, MySQL, PostgreSQL
Categories
Backend, DevOps