Site Reliability Engineer II (SRE II)
LivePerson
6 days ago
Remote, India
Mid Level / Senior
H1B Sponsor
Responsibilities
- Collaborate with Developers, QA, and Product teams during sprint planning.
- Participate in the application release cycle to ensure reliable deployments.
- Manage and operate Kubernetes clusters in GKE and EKS.
- Develop and manage Terraform modules for cloud infrastructure.
- Standardize service deployments using Helm.
- Enhance observability with Prometheus, Grafana, and Datadog.
- Design and maintain GitLab CI/CD pipelines.
- Drive automation by developing scripts in Python, Go, or Shell.
- Participate in a 24/7 on-call rotation for incident management.
- Perform root cause analysis and contribute to post-incident reviews.
- Proactively identify reliability or scalability gaps.
Requirements
- 5-8 years of experience as a Site Reliability Engineer, Platform Engineer, or DevOps Engineer.
- Hands-on experience managing Kubernetes clusters in GCP and AWS.
- Strong knowledge of Terraform, Helm, and GitLab CI/CD pipelines.
- Proficiency in Python, Go, or Shell scripting.
- Experience with observability stacks like Prometheus, Grafana, and Datadog.
- Deep understanding of Linux systems and container orchestration.
- Experience in Agile/Scrum environments.
- Excellent analytical skills with a proactive attitude.
Benefits
- 15 Days PTO plus Casual and Sick Leave.
- Family floater insurance coverage of 8 lakhs.
- Personal Accident and Life Insurance coverage of 3x Gross Annual Salary.
- Flexible working arrangements and career growth opportunities.
Tech Stack
AWS, Datadog, GitLab CI/CD, Go, Google Cloud Platform, Grafana, Helm, Istio, Kubernetes, Prometheus, Python, Terraform
Categories
DevOps