Valtech

Site Reliability Engineer

1 day ago
Bengaluru, India
Senior
H1B Sponsor

Responsibilities

  • Maintain and improve observability systems (monitoring, logging, alerting).
  • Define, adjust, and maintain Service Level Objectives (SLOs).
  • Participate in incident resolution and on-call rotations (at most one week per month).
  • Drive proactive reliability improvements across platforms.
  • Collaborate with teams to analyze failure scenarios and implement mitigations.
  • Create and maintain runbooks for incident response and prevention.
  • Eliminate non-value-adding tasks through automation and process optimization.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • 2+ years in DevOps, SRE, or Support Engineering roles.
  • Experience with incident management in high-traffic, public-facing platforms.
  • Strong scripting skills in Python, Bash, or PowerShell.
  • Familiarity with CI/CD tools such as GitHub Actions, Azure DevOps, GitLab, and Jenkins.
  • Experience with monitoring/APM tools such as Datadog, New Relic, Dynatrace, Prometheus, and Grafana.
  • Basic knowledge of serverless services in AWS, Azure, or GCP.
  • Proficiency with Docker and containerized environments.
  • Excellent English communication skills (B2+ level).
  • Experience working in international, cross-cultural teams.

Benefits

  • Flexibility, with hybrid work options (country-dependent).
  • Learning and development, with access to cutting-edge tools, training, and industry experts.

Tech Stack

Ansible, AWS, Azure, Bash, Chef, Datadog, Docker, GitHub Actions, Google Cloud Platform, Grafana, Jenkins, PowerShell, Prometheus, Python

Categories