Babylist

Staff Engineer, Site Reliability

about 3 hours ago
Toronto, Canada
Staff+
H1B Sponsor

Responsibilities

  • Manage and evolve the AWS environment using Terraform.
  • Own the speed and reliability of CI systems for the Engineering organization.
  • Support developers across local, staging, and production environments.
  • Establish monitoring and alerting standards for actionable insights.
  • Lead incident response and drive post-incident reviews.
  • Contribute to architectural decisions shaping infrastructure evolution.

Requirements

  • Deep hands-on expertise in Terraform.
  • Proven experience with AWS at scale, including EKS and RDS.
  • Experience operating Kubernetes in production environments.
  • Comfortable designing and improving CI/CD systems.
  • Strong observability instincts with tools like Datadog and Sentry.
  • Experienced in on-call and incident management.
  • Familiarity with AI tools to enhance work efficiency.

Benefits

  • Company-paid medical, dental, and vision insurance.
  • Retirement savings plan with company matching.
  • Generous paid parental leave and PTO.
  • Paid week off at the end of the year for all employees.
  • Remote work stipend for office setup.
  • Perks for physical, mental, and emotional health.

Tech Stack

AWS, CircleCI, Datadog, GitHub Actions, Kubernetes, MySQL, Redis, Ruby on Rails, Terraform

Categories