Creditas

Senior Site Reliability Engineer, Observability

full-time

Location Type: Hybrid

Location: São Paulo, Brazil

About the role

  • Telemetry Architecture: Design and maintain data pipelines for the "three pillars" of observability (logs, metrics, and traces), ensuring end-to-end visibility from code to infrastructure.
  • SRE and SLO management: Lead the definition of SLIs (Service Level Indicators) and realistic SLOs (Service Level Objectives). You will help engineering teams understand their error budgets so business decisions are based on performance data rather than assumptions.
  • Alerting evolution: Combat alert fatigue by building intelligent alerting systems and high-fidelity dashboards in Grafana and Datadog, focusing on what truly impacts the end customer.
  • Optimization and FinOps: Use performance data (e.g., CPU/Memory usage vs. latency) to identify cloud waste and propose cost and scalability improvements (HPA/VPA in Kubernetes).
  • Instrumentation and consulting: Support development teams in instrumenting their applications (via OpenTelemetry or APM SDKs), ensuring that generated data is useful for troubleshooting complex incidents.

Requirements

  • Experience in Critical Environments: Strong background supporting high-scale platforms (thousands of requests per second) and robust production environments.
  • Proficiency with APM Tools: Hands-on experience with market-leading tools such as Datadog, Dynatrace, or New Relic.
  • Expertise in the Technical Stack: Deep knowledge of Prometheus (metrics collection), Grafana (visualization), OpenSearch/Elasticsearch (log management), and Kong (API Gateway).
  • SRE Mindset: Practical understanding of how observability reduces MTTR (Mean Time To Repair) and how to implement a blameless post-mortem culture based on technical evidence.
  • Cloud-Agnostic Focus: Experience with AWS, but a mindset toward solutions that avoid vendor lock-in.
  • Soft Skills: Ability to influence developers to prioritize observability by design and to clearly communicate technical indicators to product leadership.
  • Availability for hybrid work: You are required to work from our office in the Morumbi area of São Paulo once a month for 4 consecutive days, typically during the first or last week of the month (Creditas in Person).

Benefits

  • Health Plan (Alice)
  • Dental Plan (SulAmérica)
  • Wellz: 100% free therapy sessions
  • Wellhub: access to gyms and studios
  • Creditas Endurance: high-impact sports incentive program
  • Pharmacy discount program (Univers)
  • Life Insurance (Porto Seguro)
  • Birthday day off
  • Extended Parental Leave: 6 months for birthing parents and 35 days for non-birthing parents
  • Family Care: support program for maternity and paternity
  • Childcare allowance
  • Assistance for dependents with disabilities (PWD)
  • SESC: access to SESC facilities for you and your dependents
  • Meal Voucher (VR): flexible benefits card (Creditas Card)
  • Payroll-deductible loan (Creditas Benefits)
  • Salary advance (Creditas Benefits)
  • Discounts on insurance (Minuto Seguros)
  • Access to exclusive financial education content in the Creditas app
  • PPR: profit-sharing program
  • Educational and professional development incentives
  • Flexible work model
  • Free bike parking at the office
  • Partnered office parking (subject to internal availability)

Applicant Tracking System Keywords

Hard Skills & Tools
data pipelines, SLIs, SLOs, alerting systems, high-fidelity dashboards, performance data, instrumentation, OpenTelemetry, APM SDKs, Kubernetes
Soft Skills
influence, communication, blameless post-mortem culture, prioritize observability