Senior Site Reliability Engineer

about 3 hours ago

Remote, United States

Senior

Responsibilities

Design, build, and operate GCP infrastructure, Kubernetes, and CI/CD pipelines.
Diagnose and troubleshoot complex distributed systems at high request volumes.
Ensure observability and analyze system behavior.
Contribute to modernizing edge, caching, and gateway layers.
Raise reliability standards through improved dashboards and incident response.
Build automation for safer deployments and production readiness.
Mentor engineers through code and design reviews.
Participate in on-call rotations and support developer on-call processes.

Requirements

Based in the United States, with working hours that overlap European engineering hours.
5+ years of experience in an SRE on-call rotation.
Experience with SRE/DevOps tools and culture.
Analytical skills for designing and optimizing infrastructure.
Experience managing scalable, cloud-based applications.
Proficient in Kubernetes for container orchestration.
Experience building CI/CD pipelines.
Familiarity with observability tools like Prometheus.
Comfortable with CDNs, edge, gateways, and caching layers.
Strong communication skills for handling incidents.

Benefits

Highly skilled, inspiring, and supportive team.
Real infrastructure scale and meaningful work.
Flexible and trust-based work environment.
Diverse global team and customer base.
Comprehensive health plans and perks.
Healthy work-life balance.
Competitive stock options and location-based salary.

Tech Stack

Elasticsearch, Google Cloud Platform, Kubernetes, PostgreSQL, Prometheus

Categories

DevOps, Security