GrepJob
Gamma

Site Reliability Engineer

Gamma
Apply
6 months ago
San Francisco, CA, USASenior / Staff+
H1B Sponsor

Base Salary

$230k - $310k/yr

Responsibilities

  • Own the reliability, availability, and performance of Gamma's production systems across AWS.
  • Build observability infrastructure including metrics, logging, tracing, and alerting.
  • Design and implement automation to reduce operational toil and improve deployment safety.
  • Lead incident response and conduct blameless post-mortems to drive systemic improvements.
  • Collaborate with engineering teams on architecture reviews and reliability best practices.
  • Manage and optimize compute, networking, databases, and managed services.

Requirements

  • 5+ years in site reliability engineering, DevOps, or systems engineering with AWS expertise.
  • Strong programming skills in Python, Go, or TypeScript/Node.js.
  • Experience with infrastructure-as-code tools like Terraform or CloudFormation.
  • Proven track record of improving system reliability through automation and monitoring.
  • Deep understanding of networking, distributed systems, and containerization technologies.
  • Sharp incident management instincts and debugging skills for complex production failures.
  • Experience scaling SaaS products or familiarity with Kafka and chaos engineering is a plus.
  • AWS certifications or experience with security frameworks like SOC 2 is a plus.

Tech Stack

Apache KafkaAWSDockerGoKubernetesNode.jsPythonTerraformTypeScript

Categories