Rainforest

Site Reliability Engineer

3 months ago
Atlanta, GA, USA
Mid Level / Senior

Responsibilities

  • Own and scale AWS-based cloud infrastructure using Terraform and IaC orchestration.
  • Build and operate Elastic Kubernetes Service (EKS) and serverless environments for core payments services.
  • Design and maintain CI/CD pipelines with GitLab for fast, safe deployments.
  • Implement monitoring and observability tools to ensure high uptime and quick incident resolution.
  • Automate infrastructure and operational processes to eliminate manual work.
  • Collaborate with application engineers to improve system performance and reliability.
  • Lead incident response efforts and conduct postmortems for continuous improvement.
  • Define and roll out SRE best practices as the company scales.
  • Optimize for cost, security, and compliance in a regulated fintech environment.
  • Support and scale Postgres database infrastructure using AWS RDS offerings.

Requirements

  • 3+ years of experience in SRE, DevOps, or cloud infrastructure roles, preferably in a startup or high-growth environment.
  • Strong hands-on experience with cloud infrastructure (AWS, Google Cloud, Azure).
  • Deep experience with IaC using tools such as Terraform, OpenTofu, Terragrunt, or CloudFormation.
  • Solid production experience with container orchestration (Kubernetes, ECS).
  • Experience building CI/CD pipelines using tools like GitLab and GitHub Actions.
  • Strong understanding of monitoring and observability principles and design.
  • Proficiency in at least one modern programming language (e.g., Python, Java, Go, or Ruby).
  • Bachelor’s degree or equivalent work experience in Information Science, Computer Science, or related disciplines is preferred.

Benefits

  • Comprehensive health benefits package.
  • Unlimited paid time off.
  • Paid parental leave.
  • Fun and flexible working environment.
  • Continuous investment in employee development and company culture.

Tech Stack

AWS, Azure, Go, Google Cloud, Java, Kubernetes, PostgreSQL, Prometheus, Python, Ruby, Terraform
