GrepJob
New Era Technology

Site Reliability Engineer (SRE)

29 days ago
Delhi, India
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Maintain reliable, scalable, and secure production environments.
  • Implement and manage monitoring, alerting, and logging solutions.
  • Contribute to defining and tracking SLIs/SLOs and support error budget practices.
  • Automate operational tasks to improve efficiency and reduce manual effort.
  • Perform troubleshooting and Root Cause Analysis (RCA) for production incidents.
  • Optimize system performance, availability, and capacity.
  • Maintain runbooks, SOPs, and incident documentation in Confluence.
  • Adhere to change management, deployment governance, and disaster recovery standards.
  • Support incident response for critical production services.
  • Coordinate with external vendors and internal cross-functional teams.
  • Work closely with Engineering, Product Owners, and Operations teams.
  • Manage incidents and changes using ServiceNow & JIRA.
  • Collaborate through Slack and structured communication channels.

Requirements

  • 3+ years of experience in Site Reliability Engineering or a related field.
  • Strong knowledge of Windows and Linux/Unix systems.
  • Solid understanding of networking fundamentals (DNS, TCP/IP, Load Balancing, Firewalls).
  • Experience with at least one cloud platform (AWS, Azure, or GCP).
  • Proficiency in at least one scripting/programming language (Python, Go, Bash, PowerShell, or Java).
  • Understanding of CI/CD pipelines and automation practices.
  • Hands-on experience with Docker and Kubernetes.
  • Experience with monitoring and dashboarding tools such as Grafana or Power BI.
  • Experience with ServiceNow & JIRA (incident/change/problem workflows).
  • Working knowledge of Confluence for technical documentation.