Zeta

Senior Site Reliability Engineer

over 2 years ago
Hyderābād, India
Senior / Mid Level
H1B Sponsor

Responsibilities

  • Understand and monitor application performance by implementing monitoring solutions.
  • Analyze current systems to reduce existing problems and suggest upgrades.
  • Provide support for monitoring, processes, tools, architecture, and root cause analysis.
  • Develop and maintain monitoring and alerting systems to proactively detect issues.
  • Automate routine tasks to enhance system efficiency and minimize downtime.
  • Troubleshoot and resolve incidents and outages.
  • Create automation scripts and tools for efficient system management.
  • Identify improvement areas and design scalable, reliable solutions.
  • Monitor alerts to prevent production outages, and maintain runbooks.

Requirements

  • 4-6 years of sysadmin experience with large-scale distributed systems.
  • Strong foundation in cloud management.
  • Proficiency in Unix shells, Python, and Go programming.
  • Experience with MySQL or PostgreSQL databases.
  • Ability to collaborate effectively in a multifaceted environment.
  • Excellent interpersonal and written communication skills.
  • Strong debugging and troubleshooting skills.
  • Experience with observability tools like Prometheus and Grafana.
  • Hands-on experience with AWS and AWS-CLI.
  • Experience with container orchestration and containerization (Kubernetes, containers).
  • Familiarity with CI/CD tools like Jenkins and ArgoCD.
  • Solid understanding of networking concepts.
  • Experience with Linux OS and shell/Python scripting.
  • Knowledge of API gateway systems such as Kong and NGINX.
  • Understanding of security best practices.
  • BS degree in Computer Science or related technical field.

Tech Stack

Argo CD, AWS, Go, Grafana, Jenkins, Kubernetes, Linux, MySQL, PostgreSQL, Prometheus, Python

Categories

DevOps, Security