
Staff Site Reliability Engineer
AlphaSense, posted 7 days ago
Base Salary
$150k - $225k/yr
Responsibilities
- Architect reliability frameworks and self-service tooling for teams.
- Advance AI-driven reliability by automating diagnostics and preventing failures proactively.
- Embed SRE practices across engineering via design reviews and operational standards.
- Act as Incident Commander during critical events and lead blameless postmortems.
- Deliver end-to-end monitoring and profiling to optimize performance.
- Mentor engineers across SRE and product teams through knowledge sharing.
Requirements
- 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role.
- At least 3 years in a Senior+ SRE position.
- Strong background in running production SaaS systems at scale.
- Proficiency in at least one programming/scripting language (Python, Go, or similar).
- Hands-on expertise with cloud platforms (AWS, GCP, or Azure) and Kubernetes.
- Deep understanding of networking fundamentals (TCP/IP, DNS, HTTP/S, load balancing).
- Experience with monitoring and alerting tools (Prometheus, Grafana, Datadog, ELK).
- Familiarity with advanced observability techniques (OTEL, continuous profiling).
- Proven incident management experience, including leading high-severity incidents.
- Strong troubleshooting skills across the full stack.
- Excellent communication and collaboration skills.