Senior Site Reliability Engineer - Observability
Okta
about 3 hours ago
Bengaluru, India
Senior / Mid Level
H1B Sponsor
Responsibilities
- Lead the design and tuning of Splunk environments for optimal performance.
- Architect and maintain Grafana dashboards for real-time system health.
- Design, build, and maintain scalable observability infrastructure using Terraform.
- Optimize telemetry data collection, processing, and storage for reliability.
- Develop custom Splunk workflows for automated system event responses.
- Participate in on-call rotations and lead post-incident reviews.
Requirements
- Deep, hands-on experience with Splunk administration and search optimization.
- Proven ability to build actionable dashboards in Grafana.
- At least 3 years of experience in an SRE, DevOps, or Systems Engineering role.
- Strong coding skills in Go, Python, or Ruby.
- Hands-on experience with OpenTelemetry, Prometheus, or similar frameworks.
- Deep understanding of Linux internals, networking, and container orchestration.
Benefits
- Comprehensive benefits package.
- Opportunities for social impact initiatives.
- Support for talent development and community building.
Tech Stack
AWS, Azure, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, Ruby, Splunk, Terraform
Categories
DevOps, Security