Principal Site Reliability Engineer

Fidelity Investments

Full-time

Posted on: 3/10/2026

Location Type: Hybrid

About the role

Design and maintain highly available cloud infrastructure using Infrastructure as Code (AWS CDK, Terraform, CloudFormation)
Build and optimize CI/CD pipelines for automated testing, deployment, and monitoring of applications
Implement and manage containerized applications using ECS, Docker, and Kubernetes with a focus on reliability and performance
Monitor system performance, availability, and security across all environments using observability tools
Collaborate with enterprise DBA teams to support YugabyteDB database operations, performance tuning, and disaster recovery
Automate operational tasks, implement backup/disaster recovery procedures, and establish SLAs/SLOs
Participate in on-call rotation, incident response, and post-mortem analysis to drive continuous improvement
Ensure compliance with financial regulations and security best practices while mentoring team members

Requirements

5+ years of Site Reliability Engineering or DevOps experience with cloud platforms (AWS, Azure, or GCP) including compute, storage, networking, and managed services
Proficiency with Infrastructure as Code tools (AWS CDK, Terraform, CloudFormation) and scripting languages (TypeScript for CDK, Python, Bash, PowerShell)
Experience with database administration concepts and distributed databases, preferably YugabyteDB or similar (PostgreSQL, CockroachDB)
Experience with Liquibase, Flyway, or similar tools for managing database schema changes
Strong understanding of cloud networking, security groups, VPCs, load balancers, and DNS management
Experience building and maintaining CI/CD pipelines using Jenkins, GitLab CI, GitHub Actions, or Azure DevOps
Knowledge of monitoring and observability tools (Prometheus, Grafana, Datadog, CloudWatch) and incident management practices
Strong Linux system administration skills and containerization experience (ECS, Docker, Kubernetes)
Excellent problem-solving skills with the ability to troubleshoot complex distributed systems and work independently

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

AWS CDK, Terraform, CloudFormation, CI/CD, ECS, Docker, Kubernetes, YugabyteDB, TypeScript, Python

Soft Skills

problem-solving, independent work, mentoring, collaboration, incident response, continuous improvement