Senior Site Reliability Engineer
Zeta
over 2 years ago
Hyderābād, India
Senior / Mid Level
H1B Sponsor
Responsibilities
- Understand and monitor application performance by implementing monitoring solutions.
- Analyze current systems to resolve existing problems and recommend upgrades.
- Provide support in monitoring, processes, tools, architecture, and Root Cause Analysis.
- Develop and maintain monitoring and alerting systems to proactively detect issues.
- Automate routine tasks to enhance system efficiency and minimize downtime.
- Troubleshoot and resolve incidents and outages.
- Create automation scripts and tools for efficient system management.
- Identify improvement areas and design scalable, reliable solutions.
- Monitor alerts to prevent production outages and maintain runbooks.
Requirements
- 4-6 years of sysadmin experience with large-scale distributed systems.
- Strong foundation in cloud management.
- Proficiency in Unix shells, Python, and Go programming.
- Experience with MySQL or PostgreSQL databases.
- Ability to collaborate effectively in a multifaceted environment.
- Excellent interpersonal and written communication skills.
- Strong debugging and troubleshooting skills.
- Experience with observability tools like Prometheus and Grafana.
- Hands-on experience with AWS and AWS-CLI.
- Experience with containerization and container orchestration (Kubernetes).
- Familiarity with CI/CD tools like Jenkins and Argo CD.
- Solid understanding of networking concepts.
- Experience with Linux OS and shell/Python scripting.
- Knowledge of API Gateway systems like Kong and Nginx.
- Understanding of security best practices.
- BS degree in Computer Science or a related technical field.
Tech Stack
Argo CD, AWS, Go, Grafana, Jenkins, Kubernetes, Linux, MySQL, PostgreSQL, Prometheus, Python
Categories
DevOps, Security