DoubleVerify

Sr. Site Reliability Engineer I

20 days ago
Manhattan, NY, USA
Senior / Mid Level
H1B Sponsor

Base Salary

$89k - $178k/yr

Responsibilities

  • Improve and maintain the reliability, scalability, and performance of digital media measurement platforms.
  • Implement observability best practices for proactive reliability improvements.
  • Reduce MTTR for critical incidents through automation and improved monitoring.
  • Respond to incidents and manage Sev1/Sev2 situations.
  • Monitor and maintain high availability infrastructure across various environments.
  • Lead technical projects from planning through deployment.
  • Build and deploy automations to improve operational efficiency.
  • Leverage AI-assisted development tools for automation and problem resolution.
  • Implement Infrastructure-as-Code using Terraform and other tools.
  • Create and maintain documentation and runbooks for consistent incident response.
  • Participate in on-call rotations and post-incident reviews.

Requirements

  • 4+ years in Site Reliability Engineering, DevOps, or related operational roles.
  • Proficiency in Linux/Unix systems administration and scripting languages like Python, Bash, or Go.
  • Strong experience with cloud infrastructure across GCP, AWS, and OCI.
  • Expertise in monitoring and observability tools such as Prometheus and Grafana.
  • Hands-on experience with Infrastructure-as-Code tools like Terraform and Ansible.
  • Proven ability to develop and track SLIs, SLOs, and SLAs.

Tech Stack

Ansible, AWS, Bash, Go, Google Cloud Platform, Grafana, Helm, Kubernetes, MongoDB, Nagios, Prometheus, Python, Snowflake, Splunk, SQL, Terraform