Sr. Site Reliability Engineer - Top Secret Clearance (Application Software, Data)
SpaceX
3 months ago
Hawthorne, CA, USA
Senior
Base Salary
$160k - $220k/yr
Responsibilities
- Upgrade existing distributed systems to be sharded and geo-redundant across multiple data centers.
- Advance existing deployment, monitoring, and alerting infrastructure to support a multi-region environment.
- Manage petabyte-scale bare-metal compute clusters.
- Closely collaborate with engineers across all programs to create highly operable, scalable, and maintainable products.
- Engage throughout the whole software development lifecycle of services from inception to design, deployment, operation, and iterative refinement.
- Focus on performance bottlenecks and performance improvement techniques.
Requirements
- Bachelor's degree in computer science, engineering, math, or a scientific discipline and 5 years of software development experience, or 7+ years of professional experience in site reliability or DevOps.
- Experience with Linux operating systems.
- 5+ years of rigorous experience with site reliability or DevOps.
- Experience with Kubernetes and Istio for on-premises deployment.
- Experience with in-stream data processing and analytics using open source platforms such as Apache Kafka, Spark, HBase, HDFS, and Flink.
- Experience troubleshooting hardware and network-layer issues.
- Programming experience in Python, C#, Java, Scala, Go, or similar languages.
- Good understanding of version control, testing, continuous integration, build, deployment, and monitoring.
Benefits
- Comprehensive medical, vision, and dental coverage.
- Access to a 401(k) retirement plan.
- Short- and long-term disability insurance and life insurance.
- Paid parental leave and various discounts and perks.
- 3 weeks of paid vacation and eligibility for 10 or more paid holidays per year.
- Paid sick leave in accordance with company policy.
Tech Stack
- Apache Flink
- Apache HBase
- Apache Kafka
- Apache Spark
- C#
- Go
- Istio
- Java
- Kubernetes
- Linux
- Python
- Scala
Categories
- Data Engineering
- DevOps