Site Reliability Engineer
Recorded Future
about 1 month ago
Gothenburg, Sweden
Mid Level / Senior
H1B Sponsor
Responsibilities
- Ensure performance, capacity, scalability, reliability, and security of the platform.
- Make systemic improvements for recurring issues.
- Perform Root Cause Analysis for outages.
- Design and maintain scalable infrastructure on AWS.
- Develop observability solutions using tools like Grafana and ELK.
- Automate infrastructure provisioning using Terraform and Chef.
- Participate in a 24/7 on-call rotation for production incidents.
- Collaborate with engineering teams to build and run highly available applications.
- Identify and address performance bottlenecks.
- Drive continuous improvement through automation and process optimization.
Requirements
- 3+ years of experience in Site Reliability Engineering or similar roles.
- Extensive hands-on experience with AWS and networking concepts.
- Expert-level troubleshooting and diagnostic skills.
- Proven track record of reducing system downtime.
- Advanced Linux skills including networking and storage.
- Experience managing observability suites like Grafana and ELK.
- Strong proficiency in Terraform and Chef.
- A preference for automating tasks through Infrastructure as Code.
- Ability to create clear incident reports and technical documentation.
- Strong collaboration and communication skills.