The Trade Desk

Sr Site Reliability Engineer

about 12 hours ago
Sydney, Australia
Senior
H1B Sponsor

Responsibilities

  • Design, build, and scale a global network platform across physical datacenters and multi-cloud environments.
  • Support thousands of hosts worldwide by engineering reliable solutions for petabyte-scale data challenges.
  • Own troubleshooting and resolution of complex network issues to maintain high availability and performance.
  • Lead root cause analysis and postmortems to turn incidents into actionable improvements.
  • Eliminate toil by building tools and automating workflows.
  • Participate in a global on-call rotation to share responsibility for network integrity.

Requirements

  • 6-8 years of hands-on network automation and operational experience in large-scale production infrastructure.
  • Strong development and networking experience with a software-first mindset.
  • Deep expertise in TCP/IP, the OSI model, and large-scale IP routing protocols such as BGP and OSPF.
  • Hands-on experience with Kubernetes networking technologies such as Cilium and Calico.
  • Experience managing software load balancers like NGINX Ingress, Envoy, or HAProxy.
  • Skilled in troubleshooting and performance tuning in Kubernetes and Docker environments.
  • Proficient in advanced networking technologies including IPv6, SDN, and QoS implementations.
  • Experience operating network devices at scale using various network operating systems.
  • Comfortable with monitoring and alerting systems, using tools like Prometheus and Grafana.
  • Proficient in creating automation and building tools using Python or Go.
  • Interest or background in platform engineering for large-scale distributed systems.

Tech Stack

Alibaba Cloud, Ambassador, AWS, Azure, Docker, Go, Grafana, Kubernetes, Prometheus, Python

Categories

Backend, DevOps