Staff Machine Learning Platform Engineer
Faire
9 days ago
San Francisco, CA, USA
Staff+
Base Salary
$224k - $308k/yr
Responsibilities
- Design and operate ML infrastructure, including workspaces, clusters, jobs, and workflows.
- Productionize ML workloads using Spark, Delta Lake, MLflow, and Databricks Workflows.
- Teach data scientists how to use the ML platform for model development.
- Implement Unity Catalog for data governance and secure multi-tenant usage.
- Build CI/CD pipelines for ML using Terraform and Git-based workflows.
- Optimize performance, reliability, and cost across training and inference workloads.
- Configure IAM and RBAC for sensitive datasets.
- Establish observability for data quality, model performance, and platform health.
- Build and maintain ML Platform technical documentation.
Requirements
- 8+ years of experience building production ML or data platforms.
- A degree in Computer Science, Engineering, Statistics, or a related technical field.
- Strong hands-on expertise with Databricks, Spark, Delta Lake, and MLflow.
- Proficiency in Python, SQL, and distributed systems concepts.
- Experience with cloud platforms and infrastructure-as-code.
- Solid understanding of MLOps best practices: CI/CD, monitoring, reproducibility, and security.
- Experience supporting multiple ML teams in a shared platform environment.
- Willingness to take ownership of orphaned problems and learn as needed.
Benefits
- Hybrid work model with in-office attendance three days a week.
- Flexibility to work remotely up to four weeks per year.
- Equity and benefits eligibility.
Tech Stack
Apache Airflow, Apache Kafka, Apache Spark, AWS, Databricks, Datadog, Docker, GitHub Actions, Kotlin, Kubernetes, MLflow, MySQL, Python, PyTorch, Snowflake, SQL, Terraform
Categories
AI & ML, Data Science, DevOps