Site Reliability Engineer I
Tekion
15 days ago
Bengaluru, India
Entry Level / Mid Level
H1B Sponsor
Responsibilities
- Drive OpenTelemetry adoption by migrating from legacy agents to the OTel Collector.
- Automate Kubernetes tasks by creating standardized base images with pre-configured OTel agents.
- Refine metric ingestion strategies to address high cardinality issues.
- Build actionable dashboards and alerts in New Relic, Observe, and Grafana.
- Participate in on-call incident response using tools like PagerDuty.
- Automate manual operational tasks using Python, Go, or Bash.
Requirements
- 1–3 years of experience in SRE, DevOps, or Software Engineering.
- Proficient in Kubernetes, including clusters, pods, and deployments.
- Experience with observability tools like New Relic, Prometheus, or Grafana.
- Familiarity with public cloud platforms such as AWS, GCP, or Azure.
- Comfortable writing scripts in Python, Go, or Java.
- Curious mindset with a focus on understanding system failures.
Tech Stack
Ansible, AWS, Azure, Bash, Datadog, Go, Google Cloud Platform, Grafana, Java, Kubernetes, Prometheus, Python, Terraform
Categories
DevOps