Site Reliability Engineer (SRE)
Five9
2 days ago
Chennai, India
Senior / Mid Level
H1B Sponsor
Responsibilities
- Design and implement comprehensive dashboards for monitoring.
- Establish and maintain SLIs, SLOs, and error budgets.
- Build alerting systems to proactively identify issues.
- Participate in on-call rotations and lead incident response efforts.
- Build and optimize CI/CD pipelines for speed and resilience.
- Develop and maintain infrastructure using tools like Terraform.
- Automate system configuration and ensure consistency across environments.
- Ensure security scanning systems are in place and review vulnerabilities.
- Monitor and optimize cloud resource usage and costs.
- Build and maintain common services and manage database reliability.
Requirements
- 5+ years of experience in a relevant field.
- Proficiency in at least two programming languages such as Python, Shell, Java, or Node.js.
- Experience with cloud platforms like AWS, GCP, or Azure.
- Hands-on experience with Docker, Kubernetes, and container orchestration.
- Familiarity with monitoring tools like Prometheus and Grafana.
- Proficiency with infrastructure-as-code tools like Ansible and Terraform.
- Expert-level Git usage and collaborative development practices.
- Experience defining and maintaining SLOs and SLIs.
Benefits
- Work on cutting-edge infrastructure and reliability challenges.
- Exposure to large-scale distributed systems and modern cloud technologies.
- Clear career path toward Senior SRE, Staff Engineer, or Management roles.
- Collaboration with engineering teams across the organization.
Tech Stack
Ansible, AWS, Azure, Docker, Git, GitHub Actions, GitLab CI/CD, Google Cloud Platform, Grafana, Helm, Java, Kubernetes, Prometheus, Python, Terraform
Categories
DevOps, Security