CodeRabbit

Site Reliability Engineer - Platform Engineering

about 1 month ago
Bengaluru, India
Senior
H1B Sponsor

Responsibilities

  • Design, implement, and maintain scalable infrastructure on Google Cloud Platform.
  • Develop, own, and operate critical platform services.
  • Build and maintain Infrastructure as Code using Terraform and Terragrunt.
  • Establish and maintain SLI/SLO frameworks for critical services.
  • Implement monitoring, alerting, observability, and incident management solutions.
  • Conduct incident response and root cause analysis.
  • Optimize application and infrastructure performance and cost.
  • Design and implement chaos engineering practices.
  • Develop self-service platforms and tooling for engineering teams.
  • Automate operational tasks including scaling and security patching.
  • Create and maintain infrastructure APIs and abstractions.
  • Integrate security best practices into infrastructure and platform services.
  • Implement security monitoring and compliance reporting.
  • Design secure network architectures and establish disaster recovery procedures.

Requirements

  • 6-8 years of experience in Site Reliability Engineering, Platform Engineering, or DevOps roles.
  • Proven track record of managing production systems at scale.
  • Strong background in cloud platforms, particularly GCP or AWS.
  • Experience with containerization and orchestration platforms such as Docker and Kubernetes.
  • Proficiency in Node.js and TypeScript for building automation tools.
  • Advanced experience with Terraform for infrastructure management.
  • Hands-on experience with monitoring platforms like Datadog.
  • Strong Linux/Unix systems skills.
  • Knowledge of security principles for cloud infrastructure.
  • Familiarity with CI/CD tools and practices.

Benefits

  • Work on cutting-edge technology with real-world impact.
  • Collaborative and innovative environment.
  • Competitive salary, equity, and benefits.
  • Professional development opportunities.

Tech Stack

  • CircleCI
  • Datadog
  • Docker
  • GitHub Actions
  • Google Cloud Platform
  • Grafana
  • Jenkins
  • Kubernetes
  • Linux
  • Node.js
  • Prometheus
  • Terraform
  • TypeScript

Categories