TensorWave

Observability Engineer

4 months ago
Las Vegas, NV, USA
Mid Level / Senior

Responsibilities

  • Own and evolve the observability and monitoring platform built on Grafana and Prometheus.
  • Design, build, and maintain high-quality metrics pipelines using Prometheus.
  • Create clear, actionable Grafana dashboards that effectively communicate system health.
  • Define and maintain meaningful, actionable, and low-noise alerts.
  • Establish and enforce observability standards across services.
  • Partner with engineering teams to ensure proper application instrumentation.
  • Lead improvements to alerting strategies, SLOs, and SLIs.
  • Support incident response by helping teams quickly diagnose issues.
  • Continuously evaluate and improve signal quality and cost efficiency.
  • Identify and eliminate observability gaps to prevent outages.

Requirements

  • Strong hands-on experience with Grafana and Prometheus.
  • Deep understanding of metrics-based observability.
  • Experience designing monitoring and alerting systems at scale.
  • Strong knowledge of alerting best practices.
  • Experience with distributed systems and cloud or Kubernetes environments.
  • Ability to analyze system behavior using telemetry.
  • Comfortable collaborating across teams to enhance visibility.

Benefits

  • Mission-driven company.
  • Competitive Salary.
  • Stock Options.
  • 100% paid Medical, Dental, and Vision insurance.
  • Life and Voluntary Supplemental Insurance.
  • Short-Term Disability Insurance.
  • Flexible Spending Account.
  • 401(k).
  • Flexible PTO.
  • Paid Holidays.
  • Parental Leave.
  • Mental Health Benefits through Spring Health.

Tech Stack

Grafana, Helm, Kubernetes, Prometheus, Terraform

Categories