
Principal Reliability Engineer – EDS
The Hartford
full-time
Location Type: Hybrid
Location: Hartford • Connecticut • North Carolina • United States
Salary
💰 $152,800 - $229,200 per year
About the role
- Responsible for the reliability, resilience, availability, and performance of all data platforms, cloud infrastructure, data products, and data pipelines
- Define and implement the Reliability Engineering strategy for data platforms and cloud environments
- Influence architectural direction and lead large-scale, cross-organizational technical initiatives
- Develop and implement AI-driven automation for anomaly detection, alert correlation, autonomous remediation, and predictive capacity management
- Establish gold-standard incident response patterns and continuous improvement processes
- Ensure data quality, timeliness, and SLAs for data products through automated checks and reliability tooling
- Set and enforce standards for IaC, CI/CD, platform automation, and operational readiness across EDS
Requirements
- 10+ years in data, cloud, or platform engineering, site reliability engineering, or large-scale distributed systems
- Proficiency with data or cloud platforms, including architectural patterns for resilience, networking, security, and distributed data infrastructure
- Deep experience with Snowflake, EMR, Hadoop/Spark, Data Integration, and cloud-native data ecosystems
- Scripting and programming (preferably Python) for large-scale automation, platform tooling, and reliability frameworks
- Experience with Infrastructure-as-Code (Terraform, CloudFormation) and enterprise CI/CD
- Experience in regulated or highly complex enterprise environments (financial services, insurance, healthcare)
- Certifications in AWS, GCP, Kubernetes, or SRE/DevOps frameworks
- Background applying machine learning to operations: anomaly detection, event correlation, predictive modeling, and automated remediation
- Expertise with enterprise observability stacks (Prometheus, Grafana, Datadog, Splunk, Dynatrace, OpenTelemetry)
- Exceptional communication skills for interacting with executives, senior architects, product leaders, and engineering teams
Benefits
- short-term or annual bonuses
- long-term incentives
- on-the-spot recognition
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
data engineering, cloud engineering, site reliability engineering, large-scale distributed systems, data integration, scripting, programming, Infrastructure-as-Code, automation, machine learning
Soft Skills
communication, leadership, influencing, collaboration, problem-solving, continuous improvement, incident response, organizational skills, technical initiative leadership, executive interaction
Certifications
AWS, GCP, Kubernetes, SRE, DevOps