5 days ago
Responsibilities
- Operate and improve Hedera consensus node environments across testnet, previewnet, and preproduction.
- Design and implement automation-first workflows for release and preproduction environments.
- Build and maintain Infrastructure-as-Code (Terraform) on GCP.
- Improve change management, release safety, and operational predictability.
- Participate in on-call rotation, incident response, and RCA, driving corrective actions into automation.
- Partner with internal engineering teams and external stakeholders to support operational requirements.
Requirements
- Strong systems reliability mindset with experience in incident response and RCA.
- Proven ability to automate operational workflows and reduce manual toil.
- Clear communicator with the ability to work across engineering, security, and external partners.
- Deep ownership mentality with a bias toward preventative engineering over reactive fixes.
- Strong Linux and networking troubleshooting in production environments.
- Experience with Infrastructure-as-Code using Terraform.
- Familiarity with configuration management tools like Ansible.
- Experience with CI/CD automation tools such as Jenkins.
- Experience operating distributed systems or production infrastructure at scale.
- Familiarity with Kubernetes fundamentals.