Remote, Worldwide
Senior
Responsibilities
- Build and maintain the host provisioning stack using PXE boot, Ansible, and burn-in agents.
- Evolve the orchestration engine to manage clusters, containers, and VMs.
- Optimize the bin packing algorithm for performance and cost efficiency.
- Own internal tooling for Railway engineers' daily interactions with the fleet.
- Develop observability and alerting systems to preemptively catch fleet issues.
- Design and maintain CI pipelines for safe infrastructure code deployment.
- Define immutable infrastructure using Terraform and Ansible.
- Create Golang/Rust gRPC services to support millions of users.
- Write Engineering Requirement Documents to guide projects from idea to implementation.
Requirements
- Strong understanding of distributed systems and operational resilience.
- Hands-on experience with bare metal provisioning and configuration management.
- Ability to build and operate internal tools for developer experience.
- Intuition for the longevity and scalability of solutions.
- Skill in implementing solutions, creating monitors, and documenting requirements.
- Strong prioritization skills in an ambiguous startup environment.
- Grit to tackle problems, implement solutions, and scale them as needed.
- Excellent communication skills for effective collaboration.
Benefits
- Competitive salary and full health benefits, including coverage for dependents.
- Strong equity grants and equipment stipend.
- High autonomy with minimal meetings to respect personal time.
- Culture of ownership and opportunity for creative problem-solving.
- Support for personal growth and career development.
