
Site Reliability Engineer
CodeRabbit
Posted 4 months ago
Base Salary
$170k - $240k/yr
Responsibilities
- Design, implement, and maintain scalable infrastructure on Google Cloud Platform.
- Own and operate critical platform services.
- Build and maintain Infrastructure as Code using Terraform.
- Establish and maintain SLI/SLO frameworks for critical services.
- Implement monitoring, alerting, and observability solutions.
- Conduct incident response and root cause analysis.
- Optimize application and infrastructure performance.
- Develop self-service platforms and tooling for engineering teams.
- Automate operational tasks including scaling and maintenance.
- Integrate security best practices into infrastructure services.
- Design secure network architectures and establish disaster recovery procedures.
Requirements
- 6-8 years of experience in Site Reliability Engineering, Platform Engineering, or DevOps roles.
- Proven track record managing production systems at scale.
- Experience with cloud platforms, particularly AWS or Google Cloud Platform.
- Strong background in containerization and orchestration platforms.
- Proficiency in Node.js and TypeScript.
- Advanced experience with Terraform for infrastructure management.
- Hands-on experience with monitoring platforms like Datadog.
- Strong Linux/Unix systems skills.
- Knowledge of security principles for cloud infrastructure.
- Familiarity with CI/CD tools and practices.
Benefits
- Work on cutting-edge technology with real-world impact.
- Collaborative and innovative environment.
- Competitive salary, equity, and benefits.
- Professional development opportunities.