Scopely

Senior Site Reliability Engineer

about 4 hours ago
Mexico City, Mexico
Senior
H1B Sponsor

Responsibilities

  • Design and operate observability layers for AI platforms.
  • Build automated findings-to-fix loops for AI and cloud platforms.
  • Implement reliability and hardening controls for internal AI systems.
  • Codify detections, policies, and operational checks as code.
  • Review platform and AI-application changes for reliability.
  • Own AI-platform-specific operational readiness.
  • Continuously improve production readiness through automation.

Requirements

  • 5+ years in SRE, production engineering, platform operations, or security automation.
  • Hands-on scripting and coding experience, especially in Python.
  • Experience building observability and alerting systems in AWS or similar environments.
  • Ability to reduce operational toil through automation.
  • Comfortable with incident handling and evidence-driven postmortems.
  • Interest in AI systems and in the risks of MCP-style integrations is a plus.

Tech Stack

AWS, Python, Terraform