Principal AI Engineer – Agent Ops, SRE
The Hartford
Principal AI Engineer focused on building scalable AI solutions at The Hartford. Collaborate across teams to ensure reliability in AI systems and support their lifecycle.
Posted 6/12/2026 • Full-time • Connecticut, Illinois, North Carolina, Ohio • 🇺🇸 United States • Lead • 💰 $168,400 - $220,000 per year
Tech Stack
Tools & technologies: AWS, Cloud, Docker, Python, Splunk, Terraform
About the role
Key responsibilities & impact
- Serve as technical liaison between AI COE and Platform Engineering & Enterprise SRE teams
- Ensure AI systems meet requirements for performance, latency, throughput, resiliency, recovery, observability and reliability
- Partner with AI Engineers, Applied AI Scientists, and AI Architects to design, build, and maintain scalable, fault-tolerant AI systems that meet their SLOs
- Partner with the Platform Engineering team to design and implement CI/CD, GitOps, and IaC (Terraform) modules
- Ensure adoption of our AgentOps NSA, standards, reference architecture, and tooling
- Partner with enterprise release management and AI Governance team to build & deploy AI solutions using their platform tooling
- Support the entire AI lifecycle as per the standard work template
- Build standardized deployment templates, reference architectures, automation scripts, Terraform modules, CI/CD pipelines, and operational runbooks for AI workloads
- Design and build IDP (Harness) catalogs, templates, and pipelines in partnership with the enterprise platform engineering team
- Manage production systems to ensure our enterprise SLOs are met
- Manage incident response for production systems, including triage, escalation, root cause analysis (RCA), and implementation of corrective actions
Requirements
What you’ll need
- Bachelor's degree in Computer Science, Computer Engineering, or a technical field
- 10+ years building and shipping software and/or platform solutions for enterprises
- Programming experience with Python is required
- 3+ years of experience with IaC (Terraform)
- 5+ years of experience owning production CI/CD, GitOps, and release management gating
- 3+ years of experience implementing observability, performance, and reliability solutions: SLOs, P95/P99 latency, alert tuning, and dashboards
- Experience with AI observability/monitoring tools such as Dynatrace, Splunk, Arize, and OpenTelemetry/OpenInference is a must
- Proven experience with Google's Gemini Enterprise Agent platform is a plus
- Experience with GKE/Docker/Registry is a plus
- Proven experience working with other cloud providers, such as AWS, is a plus
- Experience with Automated Testing, Automated Deployments, Agile methodologies, Unit Testing, and Integration Testing tools
- Conversational UX/UI design (multi-turn chatbots) and Human-Agent-Interaction (HAI) is a plus
- Experience with information retrieval (IR), vector embeddings, and hybrid/semantic search technologies
- Experience with LLM orchestration frameworks like LangChain, LlamaIndex, LangSmith, LangGraph, and Google Agent Development Kit is a plus
- Experience with Generative AI Guardrails, responsible AI, adversarial attack mitigation, and red teaming is a plus
- Foundational understanding of Natural Language Processing and Deep Learning
- Excellent problem-solving skills and the ability to work in a collaborative team environment
- Excellent communication skills
Benefits
Comp & perks
- Health insurance
- 401(k) matching
- Flexible work hours
- Paid time off
- Professional development opportunities
- Remote work options
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, IAC, Terraform, CICD, GitOps, observability, performance solutions, reliability solutions, Automated Testing, Natural Language Processing
Soft Skills
problem-solving, collaborative, communication