Network Site Reliability Engineer (NetSRE)
Nebius
14 days ago
Amsterdam, Netherlands or Remote, Worldwide
Mid Level / Senior
Responsibilities
- Define and own reliability goals for network services and critical paths.
- Drive reliability improvements across the entire network.
- Own incident response for your areas and lead investigations.
- Build and evolve observability with actionable metrics and alerting.
- Design safer change workflows for network changes.
- Collaborate with network engineers and platform teams.
Requirements
- Strong production Linux fundamentals and structured debugging skills.
- Solid understanding of networking basics and failure modes.
- Hands-on experience with high-availability systems.
- Ability to write and maintain software/automation, preferably in Go or Python.
- Experience with modern infrastructure tooling and operational automation.
Benefits
- Competitive salary and comprehensive benefits package.
- Opportunities for professional growth within Nebius.
- Flexible working arrangements.
- A dynamic and collaborative work environment that values initiative and innovation.
Tech Stack
Go, Linux, Python
Categories
AI & ML, Data Engineering, DevOps