Remote, Worldwide
Senior
Base Salary
$150k - $210k/yr
Responsibilities
- Design and implement self-service tools for product teams to deploy services.
- Identify and automate repetitive manual tasks to reduce operational toil.
- Implement monitoring, alerting, and dashboards for system visibility.
- Participate in on-call rotation to respond to infrastructure incidents.
- Plan capacity and optimize performance for scalable infrastructure.
- Collaborate with security, product engineering, and SRE teams.
Requirements
- 5+ years of experience with distributed systems and microservices in production.
- Strong AWS experience, including EC2, ECS/EKS, VPC networking, and IAM.
- Fluency in Infrastructure as Code, particularly with Terraform or CloudFormation.
- Programming skills in Go, Python, or similar languages for automation.
- Experience with Kubernetes in multi-tenant production environments.
- Hands-on experience with observability tools like Prometheus and Grafana.
- Incident response experience with a focus on systemic improvements.
- A security-minded approach to system design and implementation.
Benefits
- Significant influence over architectural decisions and technology choices.
- Access to a modern tech stack without legacy systems.
- Sustainable on-call rotation with fair compensation.
- Collaborative culture with design reviews and knowledge sharing.
- Remote-first environment with team meetups for connection.
- Generous health benefits and paid time off.