Responsibilities
- Build highly scalable, available, fault-tolerant distributed data processing systems.
- Develop quality data solutions and refine existing datasets into simplified models.
- Create data pipelines that optimize data quality and resilience.
- Own data mapping, business logic, transformations, and data quality.
- Conduct low-level systems debugging and performance optimization.
- Participate in architecture discussions and influence the product roadmap.
- Maintain and support existing platforms while evolving to newer technologies.
Requirements
- Extensive SQL skills.
- Proficiency in at least one scripting language, preferably Python.
- Experience with big data technologies such as Hadoop, Hive, Kafka, Spark, and Airflow.
- Deep expertise in Apache Spark, including performance tuning and building scalable data pipelines.
- Proficiency in data modeling and optimizing data architectures.
- Experience with AWS, GCP, and Looker is a plus.
- 8+ years of professional experience as a data engineer.
- BS in Computer Science; MS in Computer Science preferred.
- AI literacy and a growth mindset.
Benefits
- Comprehensive benefits including mental health and financial wellness support.
- Healthcare options including medical, dental, and vision.
- Life, accident, disability, and retirement options.
- Support for taking time off in accordance with local policies.
Tech Stack
Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS, Google Cloud Platform, Presto, Python, SQL
Categories
Data Engineering