Pine Software

Senior Site Reliability Engineer

full-time

Location Type: Remote

Location: Ukraine

About the role

  • Define, implement, and continuously monitor SLAs, SLOs, and SLIs to measure and improve product availability and reliability.
  • Design, configure, and maintain monitoring and alerting systems, including Grafana, VictoriaMetrics, Alertmanager, Grafana OnCall, Kibana, and Elasticsearch; integrate new observability tools as needed.
  • Implement and maintain distributed observability solutions, including monitoring, tracing, and OpenTelemetry-based stacks.
  • Ensure stability, high availability, and reliability of infrastructure and production systems.
  • Participate in incident response, root cause analysis, and post-mortem reviews; drive corrective and preventive improvements.
  • Take part in knowledge-sharing sessions, internal training, and documentation efforts; contribute to mentoring and hiring processes when needed.
  • Participate in software release planning and collaborate with stakeholders and management on infrastructure and capacity requirements.
  • Collaborate with stakeholders on the design, maintenance, and regular validation of backup and disaster recovery systems.
  • Provide support to managers, developers, and QA engineers on monitoring, observability, and system analysis topics.

Requirements

  • 7+ years of total work experience
  • HashiCorp Stack: Consul, Vault, Packer, Nomad, Terraform
  • Configuration Management: Ansible
  • Google Cloud Platform (GCP): VPC, GKE, Firewall, Cloud Storage, Compute Engine, Artifact Registry
  • Kubernetes: GKE and on-premises solutions, Helm, Argo CD, SSO
  • Containerization: Working with containers (Docker / containerd / Podman); building and running containers
  • CI/CD: GitLab
  • Networking: Strong fundamentals of network architecture and protocols
  • Programming Languages: Bash, Python
  • Operating Systems: Linux (Debian-based ~85%, RHEL-based ~15%), Windows family (~5%)
  • Databases: PostgreSQL, MySQL, MongoDB
  • Caching Solutions: Redis, Memcached
  • Message Queues: RabbitMQ, Kafka
  • Load Balancing & API Gateways: HAProxy, KrakenD, Kubernetes Gateway API
  • Monitoring & Observability: Prometheus / VictoriaMetrics, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), OpenTelemetry, Vector, Netdata

Benefits

  • Care from Day One – medical insurance from your first day of work, including dental care, massage, and professional psychological support, because your well-being matters.
  • Work-Life Balance – 25 days of paid vacation + 30 days of sick leave, so you can recover without unnecessary stress.
  • Investment in your energy – partial reimbursement for any sports activities that empower you.
  • Growth – partial coverage for English or Ukrainian language courses + a fixed budget for professional development. Choose what suits you best!
  • Knowledge Library – books in the office and access to the Kuka online library to learn, grow, and find inspiration.
  • Island Relaxation – 14 days a year at the corporate villa in Cyprus.
  • Office of the Future – work at Unit City, where everything is designed for productivity, even during power outages, or at the Modern Office in Larnaca, a stylish space for inspiration with open areas, cozy lounges, and functional meeting rooms, all for your comfort.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
SLA, SLO, SLI, OpenTelemetry, Ansible, Terraform, Kubernetes, Docker, CI/CD, Networking
Soft skills
incident response, root cause analysis, knowledge-sharing, mentoring, collaboration