The Hartford

Principal Reliability Engineer – EDS

Full-time

Location Type: Hybrid

Location: Hartford, Connecticut; North Carolina; United States

Salary

💰 $152,800 - $229,200 per year

About the role

  • Responsible for the reliability, resilience, availability, and performance of all data platforms, cloud infrastructure, data products, and data pipelines
  • Define and implement the Reliability Engineering strategy for data platforms and cloud environments
  • Influence architectural direction and lead large-scale, cross-organizational technical initiatives
  • Develop and implement AI-driven automation for anomaly detection, alert correlation, autonomous remediation, and predictive capacity management
  • Establish gold-standard incident response patterns and continuous improvement processes
  • Ensure data quality, timeliness, and SLAs for data products through automated checks and reliability tooling
  • Set and enforce standards for IaC, CI/CD, platform automation, and operational readiness across EDS

Requirements

  • 10+ years in data, cloud, platform engineering, site/reliability engineering, or large-scale distributed systems
  • Proficiency with data or cloud platforms, including architectural patterns for resilience, networking, security, and distributed data infrastructure
  • Deep experience with Snowflake, EMR, Hadoop/Spark, Data Integration, and cloud-native data ecosystems
  • Scripting and programming (preferably Python) for large-scale automation, platform tooling, and reliability frameworks
  • Experience with Infrastructure-as-Code (Terraform, CloudFormation) and enterprise CI/CD
  • Experience in regulated or highly complex enterprise environments (financial services, insurance, healthcare)
  • Certifications in AWS, GCP, Kubernetes, or SRE/DevOps frameworks
  • Background applying machine learning to operations: anomaly detection, event correlation, predictive modeling, and automated remediation
  • Expertise with enterprise observability stacks (Prometheus, Grafana, Datadog, Splunk, Dynatrace, OpenTelemetry)
  • Exceptional communication skills for interacting with executives, senior architects, product leaders, and engineering teams

Benefits

  • Short-term or annual bonuses
  • Long-term incentives
  • On-the-spot recognition

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

  • data engineering, cloud engineering, site reliability engineering, large-scale distributed systems, data integration, scripting, programming, Infrastructure-as-Code, automation, machine learning

Soft Skills

  • communication, leadership, influencing, collaboration, problem-solving, continuous improvement, incident response, organizational skills, technical initiative leadership, executive interaction

Certifications

  • AWS, GCP, Kubernetes, SRE, DevOps