Senior Site Reliability Engineer

Remote, Worldwide
Senior
H1B Sponsor

Responsibilities

  • Lead solution discovery and delivery for complex reliability and infrastructure problems.
  • Contribute to the platform's architecture, tooling, and roadmap.
  • Define and operate reliability practices, including SLOs/SLIs and alerting.
  • Resolve cross-team requests and identify systemic issues.
  • Operationalize AI workflows to improve team efficiency.
  • Mentor junior engineers and participate in hiring and onboarding.
  • Collaborate with Security on platform hardening and incident response.

Requirements

  • Solid professional experience in SRE, DevOps, or Platform Engineering.
  • Hands-on experience with Kubernetes and container tooling.
  • Experience building and managing cloud infrastructure on AWS.
  • Strong infrastructure-as-code practice with Terraform.
  • Familiarity with reliability practices such as SLOs and SLIs.
  • Solid observability background with tools like OpenTelemetry and Grafana.
  • Proficiency with CI/CD and deployment automation.
  • Comfortable with Golang and scripting languages.
  • Practical use of AI in infrastructure and operations.
  • Clear communication skills in an async-first environment.
  • Proactive and collaborative mindset.

Benefits

  • Work from anywhere.
  • Flexible paid time off.
  • Flexible working hours in an async environment.
  • 16 weeks of paid parental leave.
  • Mental health support services.
  • Stock options.
  • Learning budget.
  • Home office budget and IT equipment.
  • Budget for local in-person social events or co-working spaces.