
SRE Specialist I
Inmetrics
Full-time
Location Type: Remote
Location: Remote • 🇧🇷 Brazil
Job Level
Mid-Level, Senior
Tech Stack
AWS, Azure, Cloud, Go, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Prometheus, Python, Splunk, Terraform
About the role
- Technical Leadership and Best Practices: Serve as a technical reference for the team, supporting development and promoting Site Reliability Engineering (SRE) best practices.
- Availability and Performance: Ensure the availability, scalability, performance, and security of the company’s systems and infrastructure.
- Maintain a stable, reliable, and secure environment for all users and services.
- DevOps Culture and Integration: Foster a DevOps culture, encouraging collaboration between development, infrastructure, and information security teams.
- Automation and Monitoring: Implement and manage tools and processes for automation, monitoring, and orchestration of infrastructure and applications.
- Incident Management and Continuous Improvement: Analyze incidents, identify root causes, and propose solutions that prevent recurrence.
- Related Activities: Perform other duties inherent to the role, contributing to the efficiency and continuous evolution of services and processes.
Requirements
- Infrastructure as Code (IaC): Strong proficiency with Terraform (preferred), with knowledge of Pulumi or CloudFormation.
- APIs: Hands-on experience with Apigee.
- Observability: Experience with Datadog (preferred), and familiarity with tools such as Dynatrace, Splunk, Prometheus, Grafana, or ChaosSearch.
- Cloud Automation: Programming experience in Python or Golang.
- CI/CD: Experience with continuous integration and delivery tools such as GitHub, Jenkins, or Argo CD.
- Cloud: Senior-level experience with Google Cloud Platform (GCP); experience with AWS or Azure is also accepted.
- Managed Kubernetes: Advanced expertise in GKE (preferred), with knowledge of EKS or AKS.
- Cloud, DevOps, or Kubernetes certifications will be considered a plus.
- Experience with cost optimization and governance in multi-cloud environments.
Benefits
- Bradesco Health Plan (30% copayment)
- Bradesco Dental (no employee contribution)
- Life insurance
- Wellhub (Gympass)
- Childcare allowance
- Allowance for children with special needs
- Payroll-deductible loan (consigned credit)
- Private pension plan
- Pet care benefits
- SESC benefits access
- Conexa telemedicine
- Financial assistance
- Meal/Food allowance
- Multi-benefits card
- Option to upgrade health plan
- Extended maternity and paternity leave
- Support program for pregnant employees
- Newborn kit and the book "Acontecia quando eu nascia" ("It Happened When I Was Born")
- Professional development: courses available through the internal university
- 100% remote or hybrid work arrangements, depending on project applicability.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Terraform, Pulumi, CloudFormation, Apigee, Datadog, Dynatrace, Splunk, Prometheus, Grafana, Python
Soft skills
technical leadership, collaboration, incident management, continuous improvement
Certifications
Cloud certifications, DevOps certifications, Kubernetes certifications