Software Reliability & Platform Engineer
Renewed Vision
about 1 month ago
Remote, Worldwide
Mid Level / Senior
H-1B Sponsor
Responsibilities
- Develop tools and automation to improve reliability, performance, and developer experience using languages like Python, PHP, or Go.
- Design and implement infrastructure for scalability and resilience across AWS, Fly.io, and related cloud services.
- Build developer-facing systems including internal APIs, build pipelines, CI/CD automation, observability tooling, and service orchestration.
- Instrument and monitor applications to detect customer-facing symptoms and surface root causes through dashboards and metrics.
- Partner with developers to build services that are secure, observable, and easy to deploy.
- Define Service Level Objectives (SLOs) and metrics that balance uptime and velocity.
- Participate in on-call rotations, incident response, and postmortems to turn lessons learned into permanent fixes.
- Promote security best practices across builds, deployments, and code reviews.
- Collaborate cross-functionally to guide engineering practices and improve the software delivery lifecycle.
Requirements
- 4+ years of professional experience in software development with exposure to DevOps, SRE, or systems engineering.
- Strong programming skills in one or more languages such as PHP, Python, Go, or TypeScript.
- Experience with cloud platforms like AWS, Fly.io, or DigitalOcean.
- Solid understanding of infrastructure-as-code and automation tools like Terraform, Ansible, or Docker.
- Experience with monitoring and observability tools such as Datadog, Prometheus, or Grafana.
- Familiarity with CI/CD pipelines and deployment strategies.
- Comfortable balancing fast iteration with long-term maintainability and reliability.
- Excellent communication and collaboration skills.
- Able to work Eastern Standard Time hours (9am–5pm).