Minor Hotels Europe and Americas

Site Reliability Engineer

full-time

Location Type: Remote

Location: Remote • 🇵🇹 Portugal

Job Level

Mid-Level, Senior

Tech Stack

AWS, Cloud, Distributed Systems, Grafana, Jenkins, Kafka, Kubernetes, Postgres, Prometheus, Redis, Terraform

About the role

  • Design and maintain Datadog dashboards for monitoring critical system metrics, including:
    ➜ Kubernetes metrics.
    ➜ Application performance metrics.
    ➜ CI/CD pipeline metrics.
    ➜ AWS infrastructure metrics.
  • Lead troubleshooting efforts for metric collection, visualization, and any issues in Datadog.
  • Analyze Application Performance Monitoring (APM) data to support both technical and business decision-making.
  • Collaborate cross-functionally with engineering, operations, and product teams to implement performance improvements and resolve reliability challenges.
  • Develop and maintain infrastructure as code (Terraform preferred) to automate and streamline cloud operations.

Requirements

  • Experience in Site Reliability Engineering, DevOps, or similar roles, with a focus on cloud-native technologies and systems.
  • Deep expertise in Datadog, including dashboard creation, metric ingestion, and APM analysis.
  • Strong hands-on experience with Kubernetes, AWS services, and CI/CD pipelines.
  • Proficiency in monitoring and logging tools such as Fluentbit, Loki, Prometheus, and Grafana.
  • Solid understanding of infrastructure as code (Terraform preferred).
  • Excellent troubleshooting skills in distributed systems, especially in cloud-native environments.
  • Strong communication skills and experience working with external vendors and stakeholders.
  • Ability to work effectively in a remote, international team environment.
  • Nice to have: Experience with Tekton, Jenkins, Kafka, Redis, and PostgreSQL (Patroni).
  • Familiarity with authentication and authorization tools such as Keycloak or Tozny.
  • Knowledge of artifact and container management platforms like Harbor, ECR, or Minio.
  • Experience in security management, including authentication and authorization processes.

Benefits

  • Join a multicultural and inclusive team environment.
  • Enjoy a supportive atmosphere promoting work-life balance.
  • Engage in exciting national and international projects.
  • Hybrid work.
  • Your career growth is central to our mission.
  • Our career growth programs and diverse team of professionals support you in exploring a world of opportunities.
  • Training and certification programs.
  • Health and life insurance.
  • Referral program with bonuses for talent recommendations.
  • Great office locations.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Datadog, Kubernetes, AWS, CI/CD, Terraform, Fluentbit, Loki, Prometheus, Grafana, Kafka
Soft skills
troubleshooting, communication, collaboration, problem-solving, remote teamwork