Site Reliability Engineer (SRE)

6 months ago

Bengaluru, India

Mid Level / Senior

H1B Sponsor

Responsibilities

Design and implement observability dashboards for monitoring.
Establish and maintain service level indicators (SLIs) and service level objectives (SLOs).
Build alerting systems to proactively identify performance issues.
Participate in on-call rotations and lead incident response efforts.
Build and optimize CI/CD pipelines for deployment.
Develop infrastructure as code using tools like Terraform and Ansible.
Ensure security compliance and manage access control systems.
Monitor and optimize cloud resource usage and costs.

Requirements

Proficiency in at least two programming languages such as Python, Shell, Java, or Node.js.
Experience with cloud platforms like AWS, GCP, or Azure.
Hands-on experience with Docker for containerization and Kubernetes for container orchestration.
Familiarity with monitoring tools like Prometheus and Grafana.
Proficiency in infrastructure as code tools like Ansible and Terraform.
Expert-level knowledge of Git and collaborative development practices.
Experience defining and maintaining SLIs and SLOs.

Ansible, AWS, Azure, Docker, Git, GitHub Actions, GitLab CI/CD, Google Cloud Platform, Grafana, Helm, Java, Kubernetes, Prometheus, Python, Terraform