about 4 hours ago
Boston, MA, USA
Mid Level
H-1B Sponsor
Base Salary
$116k - $174k/yr
Responsibilities
- Develop and maintain scalable data pipelines and core tables using PySpark, Airflow, and dbt.
- Tune Spark jobs and storage patterns for low-latency data retrieval.
- Define clear data contracts with upstream teams and ensure well-documented datasets.
- Monitor data freshness, volume anomalies, and schema changes for reliability.
- Collaborate with Product, Engineering, and AI/ML teams to align metrics with business goals.
- Explore opportunities to integrate AI into workflows for efficiency.
Requirements
- 2+ years of experience in data engineering or a data-intensive software engineering role.
- Fluent in SQL and Python for high-performance querying and data manipulation.
- Hands-on experience with Spark (PySpark/SparkSQL) and cloud environments (AWS/EMR).
- Familiarity with modern modeling tools like dbt and concepts like partitioning and schema evolution.
- Strong focus on performance and latency optimization.
- Collaborative mindset with a curiosity for experimenting with AI tools.
Tech Stack
Apache Airflow, Apache Spark, AWS, dbt, Python, SQL, Terraform
Categories
AI & ML, Data Engineering