Wells Fargo

Principal Engineer – AI Ops

The Principal Engineer in AIOps at Wells Fargo drives strategy and execution for Zero Touch Production capabilities and collaborates with cross-functional teams to implement AI/ML solutions across the enterprise.

Posted 6/11/2026 • Full-time • Charlotte • New Jersey, North Carolina • 🇺🇸 United States • Lead • 💰 $159,000 - $305,000 per year

Tech Stack

Tools & technologies
Ansible, Cloud, Distributed Systems, Kafka, Kubernetes, Microservices, Prometheus, Python, SDLC, Splunk, Terraform

About the role

Key responsibilities & impact
  • Lead the strategy, design, and execution of AIOps platforms and capabilities to enable Zero Touch Production across CCIBT
  • Define and drive the enterprise-wide AIOps roadmap, including observability, event correlation, anomaly detection, predictive insights, and automated remediation
  • Architect and implement self-healing systems leveraging AI/ML, event-driven automation, and closed-loop workflows
  • Drive adoption of intelligent incident management, root cause analysis (RCA), noise reduction, and auto-resolution techniques
  • Establish target-state architecture and engineering standards for AIOps platforms, tooling, and integrations
  • Influence enterprise technology strategy by evaluating emerging AIOps trends, tools, and frameworks
  • Partner with SRE, infrastructure, cloud, and application teams to embed AIOps into SDLC, CI/CD, and production operations
  • Lead large-scale engineering initiatives with cross-functional and enterprise impact
  • Provide thought leadership on resilience engineering, reliability, automation, and production excellence
  • Mentor and guide senior engineers and teams on AIOps best practices, architecture, and implementation
  • Collaborate with risk, compliance, and governance teams to ensure secure, compliant, and auditable automation

Requirements

What you’ll need
  • 7+ years of Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 5+ years of experience in AIOps, SRE, production engineering, or large-scale distributed systems operations
  • 4+ years of experience with Python or other programming or scripting languages
  • 2+ years of experience working with Generative AI, large language models (LLMs), or foundation models
  • 2+ years of agentic AI and agent-building experience
  • Experience with AI-powered development or GitHub Copilot
  • Proven experience designing and implementing observability, monitoring, and automation platforms at scale
  • Deep expertise in AIOps platforms and tools (e.g., Prometheus, AppDynamics, Splunk, ITRS Geneos, BigPanda, OpenTelemetry ecosystems)
  • Strong experience with AI/ML for IT operations, including anomaly detection, event correlation, forecasting, and intelligent alerting
  • Hands-on experience with automation frameworks (e.g., Ansible, Terraform, or similar) and event-driven architectures
  • Strong understanding of SRE principles, SLIs/SLOs, error budgets, and reliability engineering practices
  • Experience building self-healing systems and closed-loop remediation workflows
  • Proficiency in cloud platforms and cloud-native architectures (Kubernetes, microservices)
  • Knowledge of data pipelines, streaming platforms (Kafka), and telemetry ingestion/processing
  • Familiarity with GenAI/LLM-assisted operations, including incident summarization, knowledge mining, and automated runbook generation
  • Ability to operate across complex organizational structures with strong stakeholder management and communication skills
  • Proven ability to define target-state architecture, operating models, and actionable roadmaps
  • Ability to manage multiple high-complexity engineering initiatives with significant enterprise impact
  • Strong analytical, problem-solving, and architectural design skills
  • Excellent communication and documentation skills (e.g., Confluence, Git, architecture diagrams)
  • Comfortable driving transformation and influencing senior leadership in a fast-paced, evolving environment

Benefits

Comp & perks
  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance, critical illness insurance, and accident insurance
  • Parental leave
  • Critical caregiving leave
  • Discounts and savings
  • Commuter benefits
  • Tuition reimbursement
  • Scholarships for dependent children
  • Adoption reimbursement

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
AIOps, Python, Generative AI, large language models, AI/ML, observability, automation, self-healing systems, event-driven architectures, cloud-native architectures
Soft Skills
stakeholder management, communication skills, analytical skills, problem-solving skills, architectural design skills, mentoring, thought leadership, collaboration, influencing, driving transformation