
SRE Analyst – Mid
Compass
full-time
Location Type: Hybrid
Location: Barueri • Brazil
About the role
- Continuously monitor the production environment, tracking ticket queues and alerts via management tools;
- Analyze and respond to Level 1 incidents, ensuring rapid identification and initial remediation of issues;
- Execute operational procedures such as service restarts and environment recovery actions;
- Perform log and metric analysis using observability tools to diagnose failures;
- Act proactively to detect incidents and service degradation;
- Set up and lead war rooms, coordinating communication and actions to resolve critical incidents;
- Escalate incidents to internal teams and vendors when necessary;
- Track and ensure follow-up on incidents, keeping stakeholders updated, including executive levels;
- Support the stability and availability of microservices-based applications and distributed environments;
- Collaborate with development and operations teams to resolve problems and drive continuous improvement of the environment;
- Contribute to the evolution of SRE practices and the DevOps culture in day-to-day operations.
Requirements
- Previous experience working as an SRE, NOC, or Command Center analyst;
- Knowledge of microservices architecture;
- Experience with CI/CD pipelines and practices;
- Knowledge of Kubernetes;
- Experience with AWS cloud;
- Experience with monitoring and troubleshooting tools, such as Dynatrace;
- Knowledge of Linux operating systems;
- Experience with DevOps culture and SRE practices;
- Experience in incident management and log analysis;
- Strong analytical and problem-solving skills;
- Clear communication for interaction with technical teams and stakeholders;
- Completed bachelor's degree;
- Desirable: Experience with ITSM tools (e.g., ServiceNow); experience in high-availability and mission-critical environments; experience automating operational routines; knowledge of advanced observability practices (metrics, logs, and traces); experience leading critical incidents and crisis management.
Benefits
- Hybrid model (2 in-office days per week).
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
microservices architecture, CI/CD pipelines, Kubernetes, AWS cloud, Linux operating systems, incident management, log analysis, observability practices, troubleshooting tools, DevOps practices
Soft Skills
analytical skills, problem-solving skills, clear communication, collaboration, leadership, stakeholder management, proactive incident detection, coordination, continuous improvement, crisis management