Responsibilities
- Extend and operate the Data Factory infrastructure for CFD simulations.
- Design job scheduling systems to maximize throughput and handle failures.
- Build monitoring systems to detect simulation failures and resource bottlenecks.
- Create high-performance data pipelines for ML-ready training data.
- Implement geometry preprocessing workflows and post-processing pipelines.
- Conduct comprehensive validation checks at every pipeline stage.
- Deliver validated datasets to downstream ML training infrastructure.
- Design data versioning and cataloging systems for reproducible training.
Requirements
- 5+ years of experience in data engineering, HPC engineering, or simulation infrastructure.
- Strong experience with orchestration systems like SLURM, Kubernetes, or Temporal.
- Proven ability to build and operate reliable data pipelines.
- Proficiency in Python for pipeline development and automation.
- Familiarity with Linux, networking, and storage systems.
- Experience with cloud infrastructure, ideally GPU/HPC-focused clouds.
- Background in HPC for simulation engineering, particularly CFD or FEA.
- Experience with geometry processing and scientific data formats.
Benefits
- Equity options to share in the company's success.
- 10% employer pension contribution for future investment.
- Free office lunches to keep you energized.
- Enhanced parental leave with full pay for new parents.
- YellowNest nursery scheme to assist with childcare costs.
- 25 days of annual leave plus public holidays.
- Private medical insurance with 100% employee cover.
- Wellhub subscription for access to wellness resources.
- Confidential Employee Assistance Programme for wellbeing support.
- Bike2Work scheme and season ticket loan for easier commuting.
- Octopus EV salary sacrifice for sustainable driving options.
Categories
AI & ML, Data Engineering
