DevRev

Site Reliability Engineer

15 days ago
Chennai, India
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Design, build, and maintain infrastructure across AWS, GCP, and Azure using Infrastructure as Code principles.
  • Implement and optimize CI/CD pipelines using tools like Argo and CircleCI.
  • Manage and scale Kubernetes clusters in production environments.
  • Administer and optimize cloud databases for performance and reliability.
  • Develop monitoring, alerting, and observability solutions.
  • Automate routine operational tasks to improve system reliability.
  • Conduct incident response and post-mortem analysis.
  • Collaborate with development teams to design reliable and scalable systems.
  • Document infrastructure architecture and operational procedures.
  • Evaluate and implement new tools and technologies.

Requirements

  • 3+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
  • Strong hands-on experience with at least two major cloud providers (AWS, GCP, Azure).
  • Proficiency with Kubernetes for container orchestration.
  • Demonstrated expertise with IaC tools like Terraform or CloudFormation.
  • Experience with CI/CD platforms, particularly Argo and/or CircleCI.
  • Solid understanding of database technologies including MongoDB and RDS.
  • Proficiency in at least one programming or scripting language.
  • Experience with monitoring and observability tools.
  • Experience implementing and managing OpenTelemetry for distributed tracing.
  • Strong understanding of networking, security, and infrastructure best practices.

Tech Stack

Argo CD, AWS, Azure, Bash, CircleCI, Go, Google Cloud Platform, Grafana, Istio, Kubernetes, MongoDB, Prometheus, Python, Redis, Terraform, TypeScript

Categories