NICE

Senior Cloud Site Reliability Engineer

3 months ago
Pune, India
Senior

Responsibilities

  • Act as part of a team of SREs managing production and developing reliability improvements.
  • Lead root-cause investigations into outages, performance problems, and cost issues.
  • Develop automation for low-value tasks while balancing project delivery demands.
  • Provide technical leadership to Cloud Operations and Support teams.
  • Collaborate with DevOps and engineering teams to establish and enforce SLOs, SLAs, and error budgets.
  • Develop and configure monitoring dashboards and alerts using tools like Grafana and Azure Monitor.
  • Install and configure observability platforms including Grafana, Prometheus, and Azure Monitor.
  • Develop Bicep modules for monitoring infrastructure.
  • Optimize system performance, cost, and security through regular reviews and tuning.

Requirements

  • Must have 5+ years of experience in Site Reliability Engineering.
  • Excellent technical, analytical, and troubleshooting skills.
  • In-depth knowledge of databases and data formats (MS-SQL, Elasticsearch, YAML, JSON, XML).
  • Significant experience in programming or advanced scripting (Python, PowerShell, C#).
  • Experience with infrastructure/configuration as code and version control (ARM, Bicep, Git).
  • Strong experience managing monitoring, alerting, and dashboarding platforms (Azure Monitor, Prometheus, Grafana).
  • Demonstrable experience supporting live cloud services and platforms.
  • Expertise in developing queries for dashboards and alerting for microservices.
  • Expertise in developing custom metrics for microservices.
  • Production experience with Kubernetes and containerization.
  • Exposure to commercial cloud providers (ideally Azure; others considered).
  • Exposure to Azure DevOps pipelines is desirable (CI/CD).
  • Exposure to test frameworks is desirable (NUnit, Jasmine, Selenium).
  • Strong experience in infrastructure as code, design, and implementation strategies.
  • Efficient, effective, and respectful communication with customers and internal departments.

Benefits

  • Join a global company with endless internal career opportunities.
  • Work in a fast-paced, collaborative, and creative environment.
  • Enjoy the NiCE-FLEX hybrid model with 2 days in-office and 3 days remote work.

Tech Stack

Azure, C#, Elasticsearch, Git, Grafana, Kubernetes, PowerShell, Prometheus, Python

Categories

Data Engineering, DevOps, Security