Senior Site Reliability Engineer

about 4 hours ago

Remote, Worldwide

Senior

H1B Sponsor

Responsibilities

Lead solution discovery and delivery for complex reliability and infrastructure problems.
Contribute to the platform's architecture, tooling, and roadmap.
Define and operate reliability practices, including SLOs/SLIs and alerting.
Resolve cross-team requests and identify systemic issues.
Operationalize AI workflows to improve team efficiency.
Mentor junior engineers and participate in hiring and onboarding.
Collaborate with Security on platform hardening and incident response.

Requirements

Solid professional experience in SRE, DevOps, or Platform Engineering.
Hands-on experience with Kubernetes and container tooling.
Experience building and managing cloud infrastructure on AWS.
Strong infrastructure-as-code practice with Terraform.
Familiarity with reliability frameworks like SLOs and SLIs.
Solid observability background with tools like OpenTelemetry and Grafana.
Proficiency with CI/CD and deployment automation.
Comfortable working in Go and scripting languages.
Practical experience applying AI to infrastructure and operations.
Clear communication skills in an async-first environment.
Proactive and collaborative mindset.

Benefits

Work from anywhere.
Flexible paid time off.
Flexible working hours in an async environment.
16 weeks of paid parental leave.
Mental health support services.
Stock options.
Learning budget.
Home office budget and IT equipment.
Budget for local in-person social events or co-working spaces.

Categories

AI & ML, DevOps, Security