Attus Procuradoria Digital

Site Reliability Engineer – SRE

Site Reliability Engineer ensuring reliability and performance of critical systems at Attus. Focused on innovating public advocacy processes with reliability practices.

Posted 6/8/2026 • Full-time • Remote • 🇧🇷 Brazil • Mid-Level, Senior

Tech Stack

Tools & technologies
Ansible, DNS, Docker, Elasticsearch, Grafana, Kafka, Kubernetes, Linux, Prometheus, Python, Redis, Terraform

About the role

Key responsibilities & impact
  • Define and track reliability indicators (SLI, SLO, SLA) and operate based on Error Budget;
  • Establish high availability, resilience and disaster recovery strategies (RTO/RPO);
  • Conduct capacity planning and service performance analysis;
  • Work on the reliability and performance of applications running on Kubernetes;
  • Design and evolve system observability (logs, metrics, traces and alerts);
  • Create dashboards and alerts focused on visibility and action, reducing noise and false positives;
  • Detect issues before customers do by instrumenting services;
  • Establish and run the incident response process (classification, severity, on-call);
  • Lead or support troubleshooting of applications and distributed environments;
  • Perform root cause analysis (RCA) and post-mortems, proposing preventive measures;
  • Develop and maintain operational runbooks;
  • Automate operational tasks and incident responses (self-healing), eliminating repetitive manual work;
  • Use AI for log analysis, anomaly detection, troubleshooting and optimization (AIOps);
  • Continuously pursue the principle “automate before repeating,” advancing operational maturity;
  • Collaborate with development and platform teams to continuously improve reliability;
  • Promote a culture of reliability and best practices across teams;
  • Apply security best practices in production environments (secrets, access control, segregation);
  • Ensure traceability (logs, auditing and events);
  • Support compliance with standards such as ISO 27001 and DevSecOps practices;
  • Integrate reliability and security (Security by Design).

Requirements

What you’ll need
  • Experience or familiarity with observability tools (Grafana, Prometheus, Elastic, Dynatrace or similar)
  • Experience or familiarity with Kubernetes and containers (Docker)
  • Knowledge of Linux and networking (HTTP, DNS, TLS/SSL)
  • Knowledge of scripting and automation (Shell, Python or similar)
  • Analytical skills and a strong problem-solving focus
  • Regular use of AI in daily work and an automation mindset ("automate before repeating")
  • Organized, autonomous profile with strong technical communication — comparable to a mid/senior Full Stack Developer with production projects
  • Quick learner
  • Continuous desire to learn
  • Empathy for the customer's perspective
  • Focus on delivering the best customer experience
  • Collaborative mindset; able to offer and ask for help
  • Strong communication skills to interact with different areas
  • Proactive and well-organized
  • Alignment with our values: Honesty and Ethics; Excellence and Care in Deliverables; Recognition; Respect and Courtesy
  • Experience with SLI, SLO and Error Budget
  • Experience troubleshooting distributed systems
  • Experience with critical, high-availability environments
  • Experience with APM tools (Dynatrace, Datadog)
  • Knowledge of OpenTelemetry and instrumentation
  • Knowledge of Kafka, Elasticsearch or Redis
  • Experience with incident automation (self-healing) and IaC (Terraform, Ansible)
  • Knowledge of Chaos Engineering and service mesh
  • Experience applying AI to operations (AIOps, technical copilots)
  • Experience in regulated environments (government, legal or financial)

Benefits

Comp & perks
  • Health plan: Comprehensive care for your health.
  • Life insurance: Security and peace of mind for you and your family.
  • Partner discounts: Access to pharmacies, nutritionists and psychologists with special conditions.
  • Well-being app (Clude): Encouragement for physical activities and well-being.
  • Total Pass: Access to a wide network of nearby gyms.
  • Workplace exercise: Active breaks to care for your body during work.
  • Meal allowance: For CLT employment contracts.
  • Caju Card: A special gift for your birthday month.
  • Home office allowance: Support to set up a comfortable and productive workspace.
  • Education assistance: Support for your academic and professional development.
  • Book allowance: Encouragement to expand your knowledge.
  • Continuous development: Programs and initiatives to boost your career.
  • Innovation program: A space for you to bring ideas and make a difference.
  • Dual screen: Proper tools for improved productivity.
  • 100% remote position: Work from where you feel best.
  • FreeDay
  • Moment Off: We encourage breaks for disconnection and rest.
  • Time off for your graduation: We celebrate your achievements with you.
  • Gift for new children of employees: A token to celebrate the arrival of a new family member.
  • Welcome-back gift after paternity leave: Support upon returning from this important phase.
  • Supportive and collaborative environment: A team that helps and grows together.
  • Eco-friendly welcome kit: Start your journey with us sustainably.
  • Sustainable culture: Practical actions such as promoting composting.
  • Virtual social gatherings: Moments to celebrate and connect with the team.
  • Ongoing engagement campaigns throughout the year.
