Okta

Senior Site Reliability Engineer - Observability

Bengaluru, India
Senior / Mid Level
H1B Sponsor

Responsibilities

  • Lead the design and tuning of Splunk environments for optimal performance.
  • Architect and maintain Grafana dashboards for real-time system health.
  • Design, build, and maintain scalable observability infrastructure using Terraform.
  • Optimize telemetry data collection, processing, and storage for reliability.
  • Develop custom Splunk workflows for automated system event responses.
  • Participate in on-call rotations and lead post-incident reviews.

Requirements

  • Deep, hands-on experience with Splunk administration and search optimization.
  • Proven ability to build actionable dashboards in Grafana.
  • At least 3 years of experience in an SRE, DevOps, or Systems Engineering role.
  • Strong coding skills in Go, Python, or Ruby.
  • Hands-on experience with OpenTelemetry, Prometheus, or similar frameworks.
  • Deep understanding of Linux internals, networking, and container orchestration.

Benefits

  • Comprehensive benefits package.
  • Opportunities for social impact initiatives.
  • Support for talent development and community building.

Tech Stack

AWS, Azure, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, Ruby, Splunk, Terraform

Categories

DevOps, Security