
Principal Site Reliability Engineer
Hewlett Packard Enterprise
full-time
Location Type: Hybrid
Location: San Juan • Puerto Rico
About the role
- Enhance Infrastructure as Code (IaC) and enforce best practices.
- Optimize cloud infrastructure for scalability, security, and cost-effectiveness.
- Develop internal tools to support and streamline cloud platform operations.
- Improve CI/CD pipelines and deployment workflows using FluxCD and Jenkins.
- Address container image vulnerabilities and standardize remediation processes.
- Build Amazon Machine Images (AMIs) aligned with CIS and STIG benchmarks.
- Strengthen monitoring, alerting, and observability using Prometheus, Grafana, and logging tools.
- Troubleshoot complex production issues to ensure system reliability and customer satisfaction.
- Fine-tune distributed systems such as Apache Kafka and Cassandra.
- Collaborate with development, security, and operations teams to align infrastructure with application needs.
Requirements
- Minimum of 10 years of hands-on experience in InfraOps, DevOps, or Site Reliability Engineering (SRE).
- Proficiency with Linux systems, especially Debian-based distributions.
- Strong experience with cloud platforms such as AWS and GCP.
- Expertise in Infrastructure as Code tools like Terraform, Packer, and Ansible.
- Solid programming skills in Python and/or Golang.
- Deep understanding of containerization (Docker, containerd) and orchestration tools (AWS EKS, GCP GKE).
- Experience with GitOps workflows.
- Proven track record in implementing and maintaining CI/CD pipelines.
- Strong background in security and familiarity with security programs.
- Experience with monitoring and logging tools (Prometheus, Grafana, ELK).
- Knowledge of both relational (SQL) and non-relational databases.
- Excellent problem-solving and debugging skills with a strong sense of ownership.
- Experience managing distributed systems like Apache Kafka and Cassandra.
- Effective communicator and collaborative team player.
Benefits
- Health & Wellbeing
- Personal & Professional Development
- Unconditional Inclusion
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Infrastructure as Code, cloud infrastructure optimization, CI/CD pipelines, FluxCD, Jenkins, containerization, Docker, AWS EKS, GCP GKE, programming in Python
Soft Skills
problem-solving, debugging, ownership, communication, collaboration