Site Reliability Engineer II - LATAM

Remote, Costa Rica +3 more

Mid Level / Senior

H1B Sponsor

Responsibilities

Support the availability and durability of critical services across production environments.
Monitor service health using SLIs, SLOs, and error budgets, escalating issues when necessary.
Participate in on-call rotations, incident response, and post-incident reviews.
Develop automation for common operational tasks to reduce manual intervention.
Contribute to monitoring, logging, and alerting frameworks.
Work with CI/CD pipelines and infrastructure as code tools.
Document systems and share learnings to foster a reliability-minded culture.

Qualifications

Bachelor’s degree in Computer Science, Engineering, or a related field.
2–4 years of experience in site reliability, systems engineering, or operations.
Solid Linux systems administration and troubleshooting skills.
Familiarity with service reliability concepts and incident response.
Proficiency in at least one scripting language (Python, Bash, or Go).
Understanding of containers and microservices concepts.