15 days ago
Paris, France
Mid Level / Senior
Responsibilities
- Design and implement scalable, reliable, and fault-tolerant systems across cloud environments.
- Develop and maintain observability tools, including monitoring, logging, and alerting.
- Automate infrastructure provisioning, deployment, and incident response using IaC tools.
- Optimize system performance, scalability, and incident response workflows.
- Collaborate with development and DevOps teams to enhance system reliability.
- Conduct root cause analysis and implement preventative measures.
- Ensure high availability through load balancing and disaster recovery strategies.
- Improve CI/CD pipelines for faster and more stable deployments.
- Optimize cloud costs and resource utilization across major platforms.
- Participate in on-call rotations to address system failures.
Requirements
- 4+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering.
- Strong knowledge of cloud platforms like AWS, Azure, or GCP.
- Experience with observability and monitoring tools such as Prometheus and Grafana.
- Proficiency in Infrastructure as Code tools like Terraform or CloudFormation.
- Hands-on experience with containerization and orchestration technologies.
- Strong Linux system administration and networking fundamentals.
- Experience with incident management and root cause analysis.
- Proficiency in scripting languages for automation.
- Knowledge of load balancing and distributed systems.
- Understanding of security best practices and compliance requirements.
- Strong communication skills for cross-functional collaboration.
Benefits
- Apple hardware ecosystem for work.
- Annual Bonus.
- Top-tier Health and Life Insurance.
- Transportation Budget for commute support.
- Coverflex benefits package for meal allowances and well-being.
- Childcare support.
- Air Conference for team collaboration and growth.
- Pension Fund for long-term financial planning.
- Urban Sports Club membership.
- Free meals at the hub.
