Site Reliability Engineer - Core and Data
CRED
2 months ago
Bengaluru, India
Mid Level
Responsibilities
- Design, implement, and manage scalable, fault-tolerant cloud infrastructure.
- Work closely with engineering teams to translate business requirements into reliable infrastructure systems.
- Operate containerized workloads on AWS using ECS and EKS.
- Build and maintain observability to understand system health and performance.
- Diagnose production issues and restore services under real-world load.
- Automate infrastructure and operations using Infrastructure as Code and CI/CD pipelines.
- Ensure adherence to compliance standards for financial services infrastructure.
- Participate in on-call rotations and incident response, owning problems end-to-end.
Requirements
- 2–5 years of experience working with production infrastructure or backend systems.
- Strong Linux fundamentals and a genuine interest in operating systems.
- Comfortable troubleshooting across systems, containers, and networks.
- Hands-on experience with cloud platforms, preferably AWS.
- Exposure to container orchestration platforms such as ECS or Kubernetes.
- Curiosity about microservice ecosystems and observability.
- Experience managing large, complex distributed systems in production.
- Strong problem-solving skills and proficiency in at least one programming language.
- Exposure to data or platform workloads like Spark, Airflow, or Kafka is a plus.
- Understanding of data pipelines and resource/capacity tuning.
- Experience with observability stacks such as Prometheus or Grafana.
- Experience contributing to infrastructure or workload cost optimization in cloud environments.
Benefits
- In-house pantry with lunch and dinner provided for all team members.
- Paid sick leave and comprehensive health insurance.
- No fixed work timings, promoting a flexible work environment.
- Salaries paid before the joining date as a show of trust.
Tech Stack
Apache Airflow, Apache Flink, Apache Kafka, Apache Spark, AWS, Grafana, Kubernetes, Linux, Prometheus
Categories
Backend, Data Engineering, DevOps