Senior Site Reliability Engineer
Realtor.com Careers
Responsibilities
- Implement and maintain highly available AWS infrastructure including EKS clusters and multi-region architectures.
- Support reliability of critical services such as Skyway, Frontdoor, and Pantheon.
- Monitor SLIs, SLOs, and error budgets for Tier 1/2/3 systems.
- Implement reliability patterns including circuit breakers and automated failover.
- Implement observability solutions using New Relic for rapid troubleshooting.
- Build dashboards and alerts to reduce MTTD and MTTR.
- Identify infrastructure cost optimization opportunities and implement FinOps practices.
- Execute chaos engineering experiments to identify system weaknesses.
- Participate in on-call rotation for critical systems and conduct post-incident reviews.
- Collaborate with various teams on reliability initiatives and support security compliance.
Requirements
- 5+ years in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
- Bachelor’s degree or equivalent experience.
- 3+ years hands-on experience with AWS and Kubernetes.
- Proficiency in Python, Go, or Java.
- Production experience with observability tools and distributed systems.
- Experience with CI/CD platforms and incident response.
- Preferred: Exposure to chaos engineering tools and API Gateway technologies.
Benefits
- Inclusive and competitive medical, Rx, dental, and vision coverage.
- Family-forming benefits.
- 13 paid holidays and flexible time off.
- 8 hours of paid volunteer time off.
- Immediate eligibility for the company 401(k) plan with a 3.5% company match.
- Tuition reimbursement program for degree and non-degree programs.
- 1:1 personalized financial planning sessions.
- Student debt retirement savings match program.
- Free snacks and refreshments in each office location.