PandaDoc

Senior Site Reliability Engineer

5 months ago
Remote, Poland
Senior
H1B Sponsor

Responsibilities

  • Own and influence the incident management process end-to-end.
  • Maintain and evolve the on-prem observability stack.
  • Participate in the on-call rotation to keep production applications running smoothly.
  • Develop automations and tools to support platform reliability.
  • Contribute to production services with performance and resiliency in mind.
  • Collaborate with product engineers to foster SRE principles within the R&D organization.
  • Mentor the SRE team or product engineers.

Requirements

  • Solid programming experience in Python (Django and AsyncIO) and/or Java (Spring Boot).
  • Experience in maintaining an observability tools suite, specifically LGTM (Loki, Grafana, Tempo, Mimir).
  • Experience in development and maintenance of Python services in production.
  • Strong experience with AWS and Kubernetes.
  • Proficiency in working with relational databases (PostgreSQL) and messaging systems (e.g., RabbitMQ, NATS, Kafka).
  • Experience as an on-call SRE.
  • Enjoyment of hands-on troubleshooting of distributed systems in production environments.
  • Strong communication skills and a desire to share knowledge on reliability.
  • Proficiency in English, both written and spoken.

Benefits

  • Multisport Card for fitness and wellness activities (individual or family plan).
  • LuxMed healthcare coverage (individual or family plan).
  • UNUM life insurance protection (individual or family plan).
  • Onboarding benefit allowance for necessary work equipment and setup.
  • 6 self-care days beyond standard Polish vacation entitlements.
  • Wellness, learning, and development budgets.
  • Opportunities to purchase company stock or receive annual bonuses.

Tech Stack

Apache Kafka, AWS, Django, Grafana, Java, Kubernetes, PostgreSQL, Python, RabbitMQ, Spring Boot
