Site Reliability Engineer

5 months ago

H1B Sponsor

Base Salary

$170k - $240k/yr

Responsibilities

Design, implement, and maintain scalable infrastructure on Google Cloud Platform.
Own and operate critical platform services.
Build and maintain Infrastructure as Code using Terraform.
Establish and maintain SLI/SLO frameworks for critical services.
Implement monitoring, alerting, and observability solutions.
Conduct incident response and root cause analysis.
Optimize application and infrastructure performance.
Develop self-service platforms and tooling for engineering teams.
Automate operational tasks including scaling and maintenance.
Integrate security best practices into infrastructure services.
Design secure network architectures and establish disaster recovery procedures.

Qualifications

6-8 years of experience in Site Reliability Engineering, Platform Engineering, or DevOps roles.
Proven track record of managing production systems at scale.
Experience with cloud platforms, particularly AWS or Google Cloud Platform.
Strong background in containerization and orchestration platforms.
Proficiency in Node.js and TypeScript.
Advanced experience with Terraform for infrastructure management.
Hands-on experience with monitoring platforms such as Datadog.
Strong Linux/Unix systems skills.
Knowledge of security principles for cloud infrastructure.
Familiarity with CI/CD tools and practices.