19 days ago
Bengaluru, India
Mid Level / Senior
Responsibilities
- Design and implement automated operational workflows to improve system reliability.
- Build and maintain observability solutions using tools like Datadog.
- Partner with development teams to enhance application reliability and performance.
- Develop and maintain CI/CD pipelines and deployment automation.
- Engineer scalable solutions for production environments across Linux and Windows.
- Automate infrastructure and operational tasks using scripting languages.
- Support and enhance the reliability of database platforms such as SQL Server and MongoDB.
- Participate in incident response and drive root cause analysis.
- Define and enforce SLOs, SLIs, and error budgets.
- Collaborate with Networking, Platform, and Security teams.
Requirements
- Strong hands-on experience with Linux and Windows operating systems.
- Proven experience building automation and tooling using Python or similar languages.
- Deep understanding of observability and monitoring, preferably with Datadog.
- Experience with CI/CD pipelines and deployment automation tools.
- Operational and performance knowledge of SQL Server and MongoDB.
- Familiarity with cloud platforms like AWS and hybrid architectures.
- Solid understanding of networking concepts such as DNS and TCP/IP.
- Experience working closely with application development teams in an SRE or DevOps role.
- Experience with Kubernetes, OpenShift, and containerized workloads.
- Knowledge of infrastructure-as-code tools like Terraform or CloudFormation.
- Experience implementing automated scaling and performance tuning.
- Background in reliability engineering or DevOps in an enterprise environment.
- Familiarity with security and compliance considerations in production systems.
- Strong bias toward automation over manual processes.
- Focus on improving long-term reliability rather than reactive firefighting.
- Comfortable owning systems end-to-end and driving improvements.
- Clear communication skills for effective collaboration across teams.
- Commitment to the highest ethical standards.
