Remote, India | Senior
Responsibilities
- Lead the design and implementation of internal SDKs and self-service frameworks.
- Shift focus from pipeline building to platform engineering for batch and real-time processing.
- Take ownership of the cost-effectiveness of the Databricks ecosystem.
- Tune Spark execution plans and implement auto-scaling strategies.
- Implement Schema-on-Write validation and Data Contracts.
- Partner with the Data Architect and Data Stewards to enforce data privacy and security standards.
- Champion the use of AI-assisted development tools to improve code quality.
- Mentor engineers on distributed computing best practices.
Requirements
- 5+ years of experience building and operating production-grade data systems at scale.
- Deep, hands-on mastery of the Databricks/Spark ecosystem.
- Proven track record of building real-time/streaming architectures.
- Experience managing and optimizing cloud costs in a high-growth environment.
- Experience building APIs, tools, or frameworks for internal engineering teams.
Benefits
- Generous paid parental leave.
- Flexible time off.
- Spending accounts.
- Medical, dental, and vision insurance.
- Sabbatical after 5 years.
Tech Stack
Apache Kafka, Apache Spark, Databricks
Categories
AI & ML, Data Engineering