The Hartford

Principal AI Engineer – Agent Ops, SRE

The Principal AI Engineer builds scalable AI solutions at The Hartford, collaborating across teams to ensure the reliability of AI systems and to support their lifecycle.

Posted 6/12/2026 • Full-time • Connecticut, Illinois, North Carolina, Ohio • 🇺🇸 United States • Lead • 💰 $168,400 - $220,000 per year

Tech Stack

Tools & technologies
AWS, Cloud, Docker, Python, Splunk, Terraform

About the role

Key responsibilities & impact
  • Serve as technical liaison between the AI COE and the Platform Engineering and Enterprise SRE teams
  • Ensure AI systems meet requirements for performance, latency, throughput, resiliency, recovery, observability and reliability
  • Partner with AI Engineers, Applied AI Scientists, and AI Architects to design, build, and maintain scalable, fault-tolerant AI systems that meet their SLOs
  • Partner with the Platform Engineering team to design and implement CI/CD, GitOps, and IaC (Terraform) modules
  • Ensure adoption of the AgentOps NSA, standards, reference architecture, and tooling
  • Partner with the enterprise release management and AI Governance teams to build and deploy AI solutions using their platform tooling
  • Support the entire AI lifecycle per the standard work template
  • Build standardized deployment templates, reference architectures, automation scripts, Terraform modules, CI/CD pipelines, and operational runbooks for AI workloads
  • Design and build IDP (Harness) catalogs, templates, and pipelines in partnership with the enterprise platform engineering team
  • Manage production systems to ensure our enterprise SLOs are met
  • Manage incident response for production systems, including triage, escalation, root cause analysis (RCA), and implementation of corrective actions

Requirements

What you’ll need
  • Bachelor's degree in Computer Science, Computer Engineering, or a technical field
  • 10+ years building and shipping software and/or platform solutions for enterprises
  • Programming experience with Python is required
  • 3+ years of experience with IaC (Terraform)
  • 5+ years of experience owning production CI/CD, GitOps, and release management gating
  • 3+ years of experience implementing observability, performance, and reliability solutions: SLOs, P95/P99 latency, alert tuning, and dashboards
  • Experience with AI observability/monitoring tools such as Dynatrace, Splunk, Arize, and OpenTelemetry/OpenInference is a must
  • Proven experience with Google's Gemini Enterprise Agent platform is a plus
  • Experience with GKE/Docker/Registry is a plus
  • Proven experience working with other cloud providers such as AWS is a plus
  • Experience with Automated Testing, Automated Deployments, Agile methodologies, Unit Testing, and Integration Testing tools
  • Conversational UX/UI design (multi-turn chatbots) and Human-Agent-Interaction (HAI) is a plus
  • Experience with information retrieval (IR), vector embeddings, and hybrid/semantic search technologies
  • Experience with LLM orchestration frameworks such as LangChain, LlamaIndex, LangSmith, LangGraph, and Google Agent Development Kit is a plus
  • Experience with Generative AI Guardrails, responsible AI, adversarial attack mitigation, and red teaming is a plus
  • Foundational understanding of Natural Language Processing and Deep Learning
  • Excellent problem-solving skills and the ability to work in a collaborative team environment
  • Excellent communication skills

Benefits

Comp & perks
  • Health insurance
  • 401(k) matching
  • Flexible work hours
  • Paid time off
  • Professional development opportunities
  • Remote work options

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, IAC, Terraform, CICD, GitOps, observability, performance solutions, reliability solutions, Automated Testing, Natural Language Processing
Soft Skills
problem-solving, collaborative, communication