Compass

SRE Analyst – Mid

Full-time

Location Type: Hybrid

Location: Barueri, Brazil

About the role

  • Continuously monitor the production environment, tracking ticket queues and alerts via management tools;
  • Analyze and respond to Level 1 incidents, ensuring rapid identification and initial remediation of issues;
  • Execute operational procedures such as service restarts and environment recovery actions;
  • Perform log and metric analysis using observability tools to diagnose failures;
  • Act proactively to detect incidents and service degradation;
  • Set up and lead war rooms, coordinating communication and actions to resolve critical incidents;
  • Escalate incidents to internal teams and vendors when necessary;
  • Track incidents through to resolution, keeping stakeholders, including executives, updated;
  • Support the stability and availability of microservices-based applications and distributed environments;
  • Collaborate with development and operations teams to resolve problems and drive continuous improvement of the environment;
  • Contribute to the evolution of SRE practices and the DevOps culture in day-to-day operations.

Requirements

  • Previous experience working as an SRE, NOC, or Command Center analyst;
  • Knowledge of microservices architecture;
  • Experience with CI/CD pipelines and practices;
  • Knowledge of Kubernetes;
  • Experience with AWS cloud;
  • Experience with monitoring and troubleshooting tools, such as Dynatrace;
  • Knowledge of Linux operating systems;
  • Experience with DevOps culture and SRE practices;
  • Experience in incident management and log analysis;
  • Strong analytical and problem-solving skills;
  • Clear communication with technical teams and stakeholders;
  • Completed bachelor's degree.
  • Desirable: Experience with ITSM tools (e.g., ServiceNow); experience in high-availability and mission-critical environments; experience automating operational routines; knowledge of advanced observability practices (metrics, logs, and traces); experience leading critical incidents and crisis management.

Benefits

  • Hybrid model (2 in-office days per week).

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

  • microservices architecture, CI/CD pipelines, Kubernetes, AWS cloud, Linux operating systems, incident management, log analysis, observability practices, troubleshooting tools, DevOps practices

Soft Skills

  • analytical skills, problem-solving skills, clear communication, collaboration, leadership, stakeholder management, proactive incident detection, coordination, continuous improvement, crisis management