Geotab

Site Reliability Engineering

1 day ago
Atlanta, GA, USA
Mid Level / Senior
H1B Sponsor

Responsibilities

  • Act as a primary escalation point for critical production application/product issues.
  • Rapidly troubleshoot complex problems across the application stack using observability tools.
  • Coordinate with development, infrastructure, and technical teams during incidents.
  • Communicate incident status, impact, and resolution steps to stakeholders.
  • Improve monitoring tools and alerting mechanisms for proactive issue detection.
  • Monitor application and system health to ensure high availability.
  • Implement automation tools/scripts to streamline operational tasks.
  • Conduct system tests to validate performance and reliability.
  • Recommend design and process enhancements for application reliability.
  • Participate in post-incident reviews of major incidents to analyze disruptions.
  • Contribute to a culture of learning from incidents.
  • Participate in a 24x7 on-call rotation for critical issues.

Requirements

  • 3 to 5 years of experience in SRE/DevOps/Tier 3 roles.
  • Strong troubleshooting skills with a systematic problem-solving approach.
  • Extensive experience resolving critical incidents in production environments.
  • Proficiency in Linux and operational scripting (Bash, PowerShell, Python).
  • Experience with database querying and automated configuration management.
  • Familiarity with cloud platforms and container orchestration.
  • Understanding of application environments for troubleshooting.
  • Excellent verbal and written communication skills.
  • Strong analytical skills and ability to manage multiple tasks.
  • Experience with incident management processes.

Benefits

  • Flexible working arrangements.
  • Home office reimbursement program.
  • Baby bonus and parental leave top-up program.
  • Online learning and networking opportunities.
  • Electric vehicle purchase incentive program.
  • Competitive medical and dental benefits.
  • Retirement savings program.

Tech Stack

Ansible, Apache Superset, Argo CD, AWS, Azure, Bash, C#, Google BigQuery, Google Cloud Platform, Grafana, Kubernetes, Linux, .NET, PostgreSQL, PowerShell, Prometheus, Python

Categories

DevOps