Senior Staff Cloud Support Engineer

5 months ago

San Francisco, CA, USA or Sunnyvale, CA, USA

Senior / Staff+

H1B Sponsor

Base Salary

$180k - $220k/yr

Responsibilities

Serve as the highest-level escalation point for complex P1/P0 incidents.
Lead cross-functional root cause investigations that span multiple technical layers.
Design and improve node validation and release readiness processes.
Influence Kubernetes architecture and workload orchestration for stability.
Troubleshoot AI/ML infrastructure issues and support complex workloads.
Act as a senior technical advisor during high-risk customer incidents.
Mentor P3/P4 engineers and define technical standards for support excellence.

Requirements

8+ years of experience in SRE, DevOps, HPC, or Cloud Infrastructure roles.
Advanced expertise in Linux systems.
Deep operational experience with Kubernetes (CKA-level or higher).
Strong networking knowledge, including InfiniBand and RDMA.
Experience supporting AI/ML workloads at scale.
Proven track record of resolving multi-layer, distributed system failures.
Strong customer communication and executive-facing presence.

Benefits

Competitive compensation with Restricted Stock Units.
Paid time off and paid holidays.
Comprehensive health, dental, and vision insurance.
Employer contributions to an HSA.
Paid parental leave and life insurance.
Professional development and tuition reimbursement.
Mental health and wellness support.
Commuter benefits and cell phone stipend.
401(k) retirement plan with company match.
Volunteer time off.

Tech Stack

Kubernetes, Linux, Terraform
