Base Salary
$150k - $250k/yr
Responsibilities
- Design and build the storage and indexing layer for multimodal datasets.
- Optimize the query engine with predicate and projection pushdowns.
- Choose and extend modern open formats like Parquet and Iceberg.
- Implement versioning and schema evolution for reproducible datasets.
- Collaborate with the Dataloading team to minimize data transfer.
- Work with the Visual Understanding team to integrate model outputs.
Requirements
- Strong interest in indices and query engines.
- Familiarity with the storage hierarchy and data movement costs.
- Strong opinions on data formats like Parquet, Iceberg, and Delta.
- A passion for databases and query systems, with a habit of reading database papers.
- Experience working on storage or table-format teams is a plus.
- Knowledge of Rust or modern C++ for storage engines is desirable.
Benefits
- In-person, tight-knit team working 4 days/week in SF.
- Competitive compensation and meaningful startup equity.
- Catered lunches and dinners for employees.
- Commuter benefits and team-building events.
- Health, vision, and dental coverage.
- Flexible PTO and the latest Apple equipment.
- 401(k) plan with match.
