Remote, Worldwide
Senior
Base Salary
$130k - $140k/yr
Responsibilities
- Act as a first responder for system incidents and outages.
- Own and evolve our monitoring, alerting, and log management systems.
- Manage and optimize our database infrastructure, including MySQL, Postgres, ClickHouse, and Redis.
- Maintain and improve our server infrastructure and deployment pipelines.
- Collaborate closely with engineering teams to build scalable, resilient systems.
- Contribute to internal SRE tooling and automation efforts.
Requirements
- Strong alignment with company values.
- Proficient in English at CEFR Level C2 / ILR Level 5.
- Deep expertise with AWS and Kubernetes.
- 5+ years of experience in a Site Reliability, DevOps, or Infrastructure Engineering role.
- Proven experience scaling production systems in a high-growth environment.
- Practical experience using AI tools to improve engineering productivity.
- Experience managing incident response and production system outages.
- Hands-on experience with database operations and optimization.
- Familiarity with observability tooling, monitoring, and logging best practices.
- Based in North or South America for timezone alignment.
Benefits
- Fully remote work from anywhere in the world.
- 35 days of PTO annually and a paid sabbatical after 5 years.
- Generous U.S.-benchmarked compensation and startup equity.
- 100% medical coverage for you and your family.
- Parental leave for expanding families.
- Home office stipend to help set up your workspace.
- Learning & development stipend for professional growth.
- Annual bonus potential for eligible roles.
- Company retreats twice a year in beautiful locations.
