Sr Site Reliability Engineer
The Trade Desk
about 12 hours ago
Sydney, Australia
Senior
H1B Sponsor
Responsibilities
- Design, build, and scale a global network platform across physical datacenters and multi-cloud environments.
- Support thousands of hosts worldwide by engineering reliable solutions for petabyte-scale data challenges.
- Own troubleshooting and resolution of complex network issues to maintain high availability and performance.
- Lead root cause analysis and postmortems to turn incidents into actionable improvements.
- Eliminate toil by building tools and automating workflows.
- Participate in a global on-call rotation to share responsibility for network integrity.
Requirements
- 6-8 years of hands-on network automation and operational experience in large-scale production infrastructure.
- Strong development and networking experience with a software-first mindset.
- Deep expertise in TCP/IP, the OSI model, and large-scale IP networking protocols such as BGP and OSPF.
- Hands-on experience with Kubernetes networking technologies such as Cilium and Calico.
- Experience managing software load balancers like NGINX Ingress, Envoy, or HAProxy.
- Skilled in troubleshooting and performance tuning in Kubernetes and Docker environments.
- Proficient in advanced networking technologies including IPv6, SDN, and QoS implementations.
- Experience operating network devices at scale using various network operating systems.
- Comfortable with monitoring and alerting systems, using tools like Prometheus and Grafana.
- Proficient in creating automation and building tools using Python or Go.
- Interest or background in platform engineering for large-scale distributed systems.
Tech Stack
Alibaba Cloud, Ambassador, AWS, Azure, Docker, Go, Grafana, Kubernetes, Prometheus, Python
Categories
Backend, DevOps