Software Engineer, Site Reliability (SRE)

Sierra
Posted 7 months ago

Base Salary

$230k - $390k/yr

Responsibilities

  • Own Sierra’s observability stack for monitoring, alerting, logging, and tracing.
  • Collaborate with product and platform engineers to design reliable and scalable systems.
  • Design and implement scalable, reliable, and secure cloud infrastructure using Terraform and AWS.
  • Enhance the reliability and scalability of LLM deployments.
  • Lead improvements to deployment pipelines, CI/CD tooling, and incident management processes.
  • Define and influence SRE practices and culture across the engineering organization.

Requirements

  • 5+ years of experience in Site Reliability or Infrastructure engineering roles.
  • Experience designing for availability, scalability, and reliability.
  • Deep knowledge of Terraform, AWS services, and cloud networking.
  • Strong background in observability systems like Prometheus or Grafana.
  • Experience with enterprise customers and their compliance needs.
  • Degree in Computer Science or equivalent professional experience.

Benefits

  • Flexible (unlimited) paid time off.
  • Medical, dental, and vision benefits for you and your family.
  • Life insurance and disability benefits.
  • Retirement plan dependent on country of employment.
  • Parental leave and fertility benefits.
  • Lunch and snacks provided.
  • Discretionary benefit stipend.
  • Free alphorn lessons.

Tech Stack

AWS, Datadog, Grafana, Prometheus, Terraform