Sanity

Senior Site Reliability Engineer

about 3 hours ago
Remote, United States
Senior

Responsibilities

  • Design, build, and operate GCP infrastructure, Kubernetes, and CI/CD pipelines.
  • Diagnose and troubleshoot complex distributed systems at high request volumes.
  • Ensure observability and analyze system behavior.
  • Contribute to modernizing edge, caching, and gateway layers.
  • Raise reliability standards through improved dashboards and incident response.
  • Build automation for safer deployments and production readiness.
  • Mentor engineers through code and design reviews.
  • Participate in on-call rotations and support developer on-call processes.

Requirements

  • Based in the United States, with working hours that overlap European engineering hours.
  • 5+ years of experience in an SRE on-call rotation.
  • Experience with SRE/DevOps tools and culture.
  • Analytical skills for designing and optimizing infrastructure.
  • Experience managing scalable, cloud-based applications.
  • Proficient in Kubernetes for container orchestration.
  • Experience building CI/CD pipelines.
  • Familiarity with observability tools like Prometheus.
  • Comfortable with CDNs, edge, gateways, and caching layers.
  • Strong communication skills for handling incidents.

Benefits

  • Highly skilled, inspiring, and supportive team.
  • Real infrastructure scale and meaningful work.
  • Flexible and trust-based work environment.
  • Diverse global team and customer base.
  • Comprehensive health plans and perks.
  • Healthy work-life balance.
  • Competitive stock options and location-based salary.
