NICE

Senior Cloud Site Reliability Engineer

20 days ago
Sandy, UT, USA
Senior / Mid Level

Responsibilities

  • Create dashboards for application health observability.
  • Consult with development teams on SRE services.
  • Automate manual activities to reduce toil.
  • Participate in design and scoping of new solutions.
  • Document findings and share with other SREs.
  • Ensure proper monitoring is set up.
  • Identify and implement evolutionary improvements.
  • Meet with Incident Management to discuss major incidents.
  • Assist in data and performance analysis.
  • Review and mentor other SREs.
  • Support services before they go live.
  • Practice sustainable incident response.
  • Create automated end-to-end diagnostics.
  • Communicate effectively with technical and non-technical peers.
  • Coordinate multiple cross-functional initiatives.
  • Participate in project planning efforts.
  • Lead technical direction for testing work.
  • Provide guidance and coaching to team members.
  • Document troubleshooting steps for future reference.
  • Ensure compliance with policies and standards.
  • Implement remediation required by audits.
  • Provide on-call support for high priority incidents.
  • Estimate time for activities and projects.

Requirements

  • Bachelor's degree in Computer Science or related field.
  • 4+ years of programming/scripting experience.
  • 4+ years of experience in public or private cloud environments.
  • 4+ years of SRE or related experience.
  • Experience with Agile, Jira, GitHub, monitoring, and automation.
  • 6+ years of technical communication in English.
  • Ability to troubleshoot applications effectively.
  • Experience with complex issues across multiple applications.
  • Proactive engagement with peers and stakeholders.
  • Mentoring experience with co-workers.
  • Ability to coordinate work with peers.
  • Willingness to share discoveries and best practices.
  • Self-driven with a focus on improvement.
  • Ability to work with minimal supervision.

Tech Stack

Ansible, C#, C++, Datadog, Docker, Grafana, Java, Kubernetes, Perl, Prometheus, Python, Ruby, Splunk, Terraform

Categories

Backend, DevOps