Staff Software Engineer, Observability

about 2 months ago

H1B Sponsor

Base Salary

$200k - $230k/yr

Responsibilities

Shape the engineering organization's standards for observability.
Own and evolve the observability platform, including distributed logging, metrics, and tracing infrastructure.
Build AI-native capabilities to automatically detect anomalies, diagnose failures, and accelerate root cause analysis.
Create powerful developer experiences through dashboards, notebooks, and interactive debugging tools.
Drive reliability automation with intelligent alerting, diagnostics, and incident response systems.
Partner across engineering teams to embed observability and reliability best practices.
Mentor engineers and influence reliability culture across the organization.

Qualifications

8+ years of relevant industry experience building and operating large-scale observability or monitoring infrastructure.
Experience implementing or operating observability platforms such as Datadog, Sentry, Splunk, or similar.
Strong software engineering proficiency in at least one of Ruby, Python, or TypeScript.
Strategic systems thinker who identifies high-impact opportunities and builds scalable solutions.
Experience operating large-scale distributed systems in production, especially logging platforms or time-series databases.
Strong fundamentals in systems, networking, and cloud infrastructure such as Kubernetes and AWS.
Thrive in ambiguous environments and roll up your sleeves to solve unscoped problems end to end.
Strong communicator who can align technical and non-technical stakeholders.
Bonus if you have built or contributed to observability ecosystems such as OpenTelemetry or Prometheus.

AWS, Datadog, Kubernetes, Linux, Prometheus, Python, Ruby, Splunk, Terraform, TypeScript