Site Reliability Engineer – SRE
Attus Procuradoria Digital
Site Reliability Engineer ensuring reliability and performance of critical systems at Attus. Focused on innovating public advocacy processes with reliability practices.
Tech Stack
Tools & technologies
Ansible, DNS, Docker, Elasticsearch, Grafana, Kafka, Kubernetes, Linux, Prometheus, Python, Redis, Terraform
About the role
Key responsibilities & impact
- Define and track reliability indicators (SLI, SLO, SLA) and operate based on Error Budget;
- Establish high availability, resilience and disaster recovery strategies (RTO/RPO);
- Conduct capacity planning and service performance analysis;
- Work on the reliability and performance of applications running on Kubernetes;
- Design and evolve system observability (logs, metrics, traces and alerts);
- Create dashboards and alerts focused on visibility and action, reducing noise and false positives;
- Detect issues before customers do by instrumenting services;
- Establish and run the incident response process (classification, severity, on-call);
- Lead or support troubleshooting of applications and distributed environments;
- Perform root cause analysis (RCA) and post-mortems, proposing preventive measures;
- Develop and maintain operational runbooks;
- Automate operational tasks and incident responses (self-healing), eliminating repetitive manual work;
- Use AI for log analysis, anomaly detection, troubleshooting and optimization (AIOps);
- Continuously pursue the principle “automate before repeating,” advancing operational maturity;
- Collaborate with development and platform teams to continuously improve reliability;
- Promote a culture of reliability and best practices across teams;
- Apply security best practices in production environments (secrets, access control, segregation);
- Ensure traceability (logs, auditing and events);
- Support compliance with standards such as ISO 27001 and DevSecOps practices;
- Integrate reliability and security (Security by Design).
Requirements
What you’ll need
- Experience or familiarity with observability tools (Grafana, Prometheus, Elastic, Dynatrace or similar)
- Experience or familiarity with Kubernetes and containers (Docker)
- Knowledge of Linux and networking (HTTP, DNS, TLS/SSL)
- Knowledge of scripting and automation (Shell, Python or similar)
- Analytical skills and a strong problem-solving focus
- Regular use of AI in daily work and an automation mindset ("automate before repeating")
- Organized and autonomous, with strong technical communication; seniority comparable to a mid/senior Full Stack Developer with production projects
- Quick learner
- Continuous desire to learn
- Empathy for the customer's perspective
- Focus on delivering the best customer experience
- Collaborative mindset; able to offer and ask for help
- Strong communication skills to interact with different areas
- Proactive and well-organized
- Alignment with our values: Honesty and Ethics; Excellence and Care in Deliverables; Recognition; Respect and Courtesy
- Experience with SLI, SLO and Error Budget
- Experience troubleshooting distributed systems
- Experience with critical, high-availability environments
- Experience with APM tools (Dynatrace, Datadog)
- Knowledge of OpenTelemetry and instrumentation
- Knowledge of Kafka, Elasticsearch or Redis
- Experience with incident automation (self-healing) and IaC (Terraform, Ansible)
- Knowledge of Chaos Engineering and service mesh
- Experience applying AI to operations (AIOps, technical copilots)
- Experience in regulated environments (government, legal or financial)
Benefits
Comp & perks
- Health plan: Comprehensive care for your health.
- Life insurance: Security and peace of mind for you and your family.
- Partner discounts: Access to pharmacies, nutritionists and psychologists with special conditions.
- Well-being app (Clude): Encouragement for physical activities and well-being.
- Total Pass: Access to a wide network of nearby gyms.
- Workplace exercise: Active breaks to care for your body during work.
- Meal allowance: For CLT employment contracts.
- Caju Card: A special gift for your birthday month.
- Home office allowance: Support to set up a comfortable and productive workspace.
- Education assistance: Support for your academic and professional development.
- Book allowance: Encouragement to expand your knowledge.
- Continuous development: Programs and initiatives to boost your career.
- Innovation program: A space for you to bring ideas and make a difference.
- Dual screen: Proper tools for improved productivity.
- 100% remote position: Work from where you feel best.
- FreeDay
- Moment Off: We encourage breaks for disconnection and rest.
- Time off for your graduation: We celebrate your achievements with you.
- Gift for new children of employees: A token to celebrate the arrival of a new family member.
- Welcome-back gift after paternity leave: Support upon returning from this important phase.
- Supportive and collaborative environment: A team that helps and grows together.
- Eco-friendly welcome kit: Start your journey with us sustainably.
- Sustainable culture: Practical actions such as promoting composting.
- Virtual social gatherings: Moments to celebrate and connect with the team.
- Ongoing engagement campaigns throughout the year.
ATS Keywords
Hard Skills & Tools
Kubernetes, Docker, Linux, HTTP, DNS, TLS/SSL, Shell, Python, Terraform, Ansible
Soft Skills
analytical skills, problem-solving, technical communication, quick learner, continuous desire to learn, empathy, collaborative mindset, strong communication skills, proactive, well-organized
Certifications
ISO 27001