Xcel Energy

ML/AI Ops Engineer

full-time

Location Type: Office

Location: Denver, Colorado, Minnesota, United States

Salary

💰 $112,200 - $159,400 per year

About the role

  • Lead and support solution lifecycle technical activities
  • Ensure solutions are designed for great user experience and operational performance
  • Lead design, ensuring Enterprise Architecture, Security, Operations and Compliance aspects are continuously integrated into solutions
  • Provide input to cost and schedule estimation
  • Responsible for overall integrity of system design and operation
  • Oversee vendor activities
  • Conduct peer reviews and approve system changes and technical solution design
  • Coach and mentor less experienced team members
  • Partner across the organization to deliver optimal solutions at minimal cost
  • Provide in-depth technical information to stakeholders as needed
  • Innovate by applying emerging industry capabilities in response to evolving customer needs
  • Provide input to strategic roadmap and technical dependencies
  • Continuously stay current on, and apply, technical industry knowledge pertaining to the respective domain
  • Review solution performance and continually assess health of systems
  • Track and raise awareness of operational and technical debt risks
  • Provide escalated support to incident and problem management
  • Utilize analytics to improve availability, reliability, efficiency and capacity
  • Productionize machine learning and AI models, including classical ML and GenAI, using standardized MLOps pipelines
  • Manage end-to-end model lifecycle activities: versioning, promotion, rollback, retraining, and retirement
  • Implement CI/CD practices for models, features, and inference services
  • Design, build, and maintain reusable MLOps pipelines for training, validation, deployment, and monitoring
  • Develop common components (feature pipelines, quality checks, evaluation harnesses) to reduce friction across AI projects
  • Implement monitoring for model performance, data drift, bias, and system health
  • Own AI/ML operational SLAs, SLOs, and incident response, including root-cause analysis and post-mortems
  • Ensure high availability, resilience, and recoverability of AI services
  • Support regulated or high-risk AI use cases by embedding governance, validation, and documentation into MLOps workflows
  • Produce and maintain required artifacts such as model cards, system cards, validation evidence, and audit support materials
  • Partner closely with AI Governance and Risk teams to ensure alignment with enterprise standards

Requirements

  • Ten years of related functional experience
  • Bachelor's degree in Technology, Science, Business or a related field, or 4 years of equivalent experience
  • Excellent communication skills
  • Excellent relationship management and collaboration skills
  • Expertise managing the lifecycle of technical solutions
  • Deep subject matter expertise in the respective system domain's products, platforms, processes and architecture
  • Broad and deep knowledge of technology architecture, infrastructure, network, security and software principles and models
  • Experience working in partnership with internal and external vendors
  • Excellent analytical, problem-solving and troubleshooting skills
  • Extensive knowledge of future technology trends within area of expertise
  • Demonstrated leadership on technical aspects of large-scale projects
  • Experience coaching other developers in system deployment or operational troubleshooting
  • Experience with delivery methodologies (Waterfall, Agile, Scrum) and operational models (ITIL)
  • Experience and understanding of core IT Service Management functions, such as Change Management and Incident Management

Benefits

  • Annual Incentive Program
  • Medical/Pharmacy Plan
  • Dental
  • Vision
  • Life Insurance
  • Dependent Care Reimbursement Account
  • Health Care Reimbursement Account
  • Health Savings Account (HSA) (if enrolled in eligible health plan)
  • Limited-Purpose FSA (if enrolled in eligible health plan and HSA)
  • Transportation Reimbursement Account
  • Short-term disability (STD)
  • Long-term disability (LTD)
  • Employee Assistance Program (EAP)
  • Fitness Center Reimbursement (if enrolled in eligible health plan)
  • Tuition reimbursement
  • Transit programs
  • Employee recognition program
  • Pension
  • 401(k) plan
  • Paid time off (PTO)
  • Holidays
  • Volunteer Paid Time Off (VPTO)
  • Parental Leave

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

MLOps, machine learning, AI models, CI/CD practices, system design, analytics, operational performance, technical solution design, cost estimation, incident management

Soft Skills

communication skills, relationship management, collaboration skills, analytical skills, problem-solving skills, troubleshooting skills, leadership, coaching, mentoring, cross-organizational partnership