Site Reliability Engineer - Storage Engineer
GoDaddy
3 months ago
Toronto, Canada
Mid Level / Senior
H1B Sponsor
Responsibilities
- Automate and maintain day-to-day operations of storage systems.
- Develop and maintain tools and automation scripts for storage operations.
- Monitor system performance and implement solutions for high availability.
- Participate in agile practices including daily stand-ups and code reviews.
- Continuously improve system reliability and performance through monitoring and optimization.
Requirements
- 2+ years of professional experience with Ceph in a production environment.
- 2+ years of experience in site reliability engineering or a similar role.
- Experience with deployment, configuration, and management of Ceph clusters.
- Proficiency in Linux/Unix systems with a focus on automation.
- Proficiency in Python or Bash scripting.
- Experience with Ansible, Terraform, or SaltStack.
- Familiarity with Nagios-based monitoring tools like Icinga2.
- Experience with observability tools such as Prometheus and Grafana.
- Solid understanding of core networking concepts related to Linux/Unix systems.
Benefits
- Paid time off and retirement savings options.
- Bonus/incentive eligibility and equity grants.
- Participation in the employee stock purchase plan.
- Competitive health benefits and family-friendly perks including parental leave.
- Support for a diverse culture and employee resource groups.
Tech Stack
Ansible, AWS, Bash, Docker, Grafana, Kubernetes, Linux, Nagios, OpenStack, Prometheus, Python, Terraform
Categories
Backend, DevOps