
SRE / DevOps Specialist
Compass
full-time
Location Type: Remote
Location: Remote • Brazil
Job Level
Mid-Level, Senior
Tech Stack
Cassandra, Kubernetes, Linux, MongoDB, OpenShift, Oracle, Postgres, Python
About the role
- Perform troubleshooting and functional analysis of incidents in non-production environments;
- Provide support for applications in test environments;
- Implement and manage monitoring tools, ensuring visibility into system performance and proactive issue detection;
- Lead incident response, conducting post-incident analyses (postmortems) to identify root causes and prevent recurrence;
- Develop scripts and automation tools for repetitive tasks, increasing operational efficiency and reducing human error;
- Analyze system capacity and plan for scalability, ensuring service availability and performance;
- Collaborate with development teams to implement changes safely and efficiently, minimizing impacts on staging environments;
- Work alongside security teams, ensuring security practices are integrated into the testing lifecycle;
- Create and maintain technical documentation and operational runbooks, and support team training on best practices and tools;
- Partner with QA analysts to promote continuous improvement of system reliability and efficiency.
Requirements
- Advanced knowledge of Kubernetes;
- Strong experience with Linux operating systems;
- Experience with cloud infrastructure;
- Experience with observability (metrics, logs, and tracing);
- Knowledge of SRE and DevOps practices;
- Experience with observability and monitoring tools, especially the ELK Stack, including creating dashboards for monolithic and distributed systems;
- Knowledge of operating infrastructure in distributed environments;
- Experience building automations using Python and Shell Script;
- Experience troubleshooting applications and infrastructure with a focus on root cause identification;
- Basic knowledge of relational and non-relational databases (Oracle, MongoDB, Cassandra, and PostgreSQL);
- Nice-to-haves: experience operating applications on OpenShift and in microservices architectures; knowledge of, and experience operating, Axway API Gateway.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Kubernetes, Linux, cloud infrastructure, observability, ELK Stack, Python, Shell Script, relational databases, non-relational databases, root cause analysis
Soft skills
troubleshooting, functional analysis, incident response, collaboration, technical documentation, training, continuous improvement