
Senior Infrastructure Analyst, ePROC
INFOX Tecnologia da Informação
Full-time
Location Type: Hybrid
Location: São Paulo • Brazil
About the role
- Reliability and Availability: Design, implement, and maintain highly reliable and available systems and infrastructure, defining and monitoring Service Level Objectives (SLOs) and Service Level Indicators (SLIs);
- Incident Management: Actively participate in incident resolution, conduct root cause analyses (post-mortems), and implement corrective and preventive actions;
- Automation: Develop and implement automation solutions for repetitive operational tasks such as infrastructure provisioning, deployments, testing, and monitoring;
- Monitoring and Observability: Implement and maintain comprehensive monitoring and observability tools to identify performance issues, bottlenecks, and trends, ensuring visibility into system health;
- Capacity Management: Monitor system and infrastructure capacity, planning and implementing scalability strategies to ensure adequate performance under varying workloads;
- Performance Optimization: Identify and implement performance optimizations in applications and infrastructure, aiming for maximum efficiency and the best user experience;
- Infrastructure as Code (IaC): Use IaC tools (e.g., ArgoCD, Terraform, CloudFormation, Ansible) to provision and manage infrastructure in an automated and versioned manner;
- SRE Culture: Promote and disseminate SRE principles and practices within the team and organization;
- Collaboration: Work closely with development, operations, and security teams to ensure continuous and reliable delivery of software and services;
- Documentation: Create and maintain clear and concise documentation on infrastructure architecture, processes, and procedures;
- Security: Implement and maintain security practices in the infrastructure, following company policies and industry best practices;
- Continuous Improvement: Continuously seek new technologies and approaches to improve infrastructure reliability, efficiency, and security.
Requirements
- Bachelor's degree in Computer Science, Computer Engineering, Information Systems, or a related field.
- Proven experience in IT operations roles.
- Experience in mission-critical, high-availability environments.
- Strong knowledge of the PHP ecosystem, with emphasis on configuration, tuning, and troubleshooting of application servers (PHP-FPM, Nginx/Apache) and code performance analysis.
- Strong knowledge of Linux operating systems.
- Strong knowledge of MySQL database configuration, tuning, and troubleshooting.
- Advanced proficiency with cloud platforms (OCI, AWS, or GCP).
- Solid experience with container orchestration on Kubernetes and cluster management using Rancher and/or OpenShift.
- Experience with monitoring and observability tools (Prometheus, Grafana, Zabbix, ELK Stack, etc.).
- Expertise in infrastructure automation tools (IaC).
- Advanced knowledge of networking concepts and communication protocols: TCP/IP, DNS, HTTP, Site-to-Site VPN, Direct Connect/FastConnect, Web Application Firewall (WAF), and Layer 7 load balancing.
- Hands-on experience with DevOps practices and agile methodologies.
- Nice to have: cloud platform certifications (AWS Certified DevOps Engineer, Oracle OCI, Azure DevOps Engineer Expert, GCP Cloud DevOps Engineer).
- Advanced English for diagnostics and troubleshooting (considered a plus).
- Knowledge of the legal sector or experience with solutions for the judiciary.
Benefits
- Meal allowance or meal voucher.
- Group health plan (Bradesco) to support your well-being and that of your family.
- Dental plan to keep you smiling.
- Group life insurance (Bradesco).
- Transportation voucher for days in the office.
- Partnership with SESC.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Infrastructure as Code (IaC), PHP, Linux, MySQL, Cloud platforms, Kubernetes, Container orchestration, Monitoring tools, Networking concepts, DevOps practices
Soft Skills
Collaboration, Incident Management, Continuous Improvement, Documentation, Performance Optimization, Capacity Management, Automation, SRE Culture, Problem-solving, Root cause analysis
Certifications
AWS Certified DevOps Engineer, Oracle OCI, Azure DevOps Engineer Expert, GCP Cloud DevOps Engineer