EY

Senior AIOps Engineer

Full-time

Location Type: Remote

Location: India

About the role

  • Architect and implement enterprise-grade AIOps solutions to automate incident detection, root cause analysis, and remediation across cloud and hybrid environments.
  • Lead the integration of telemetry data from tools like Prometheus, Grafana, AppDynamics, Dynatrace, and Azure Monitor into centralized AIOps platforms for unified observability and intelligent event correlation.
  • Design and maintain ML models in Python for anomaly detection, predictive analytics, and operational forecasting, ensuring scalability and accuracy.
  • Build and optimize real-time and batch data pipelines using Apache Kafka, Logstash, and Fluentd to process logs, metrics, and traces from distributed systems.
  • Collaborate with DevOps, SRE, and platform engineering teams to embed AIOps capabilities into CI/CD workflows and infrastructure-as-code practices.
  • Drive automation of operational tasks and remediation workflows using Python, Azure Functions, and orchestration tools to enable self-healing systems.
  • Develop dashboards and visualizations using Grafana, Kibana, or Power BI to deliver actionable insights to engineering, operations, and business teams.
  • Implement alert noise reduction strategies using ML-based filtering, deduplication, and suppression techniques to improve signal-to-noise ratio.
  • Ensure compliance with security, governance, and audit policies, embedding DevSecOps principles and aligning with regulatory standards.
  • Lead technical evaluations of AIOps platforms and tools, making recommendations for adoption based on business needs and operational maturity.
  • Manage and mentor a team of AIOps engineers, fostering a culture of innovation, continuous learning, and operational excellence.
  • Partner with business stakeholders to identify opportunities for cost optimization, risk reduction, and service reliability improvements through AIOps.
  • Contribute to strategic roadmaps, budget planning, and vendor assessments, aligning AIOps initiatives with broader IT and business goals.
  • Stay current with emerging trends in AIOps, observability, cloud-native operations, and AI-driven automation, and drive their adoption within the organization.

Requirements

  • 9+ years of experience in IT operations, DevOps, or SRE, with at least 3 years in AIOps or AI/ML-driven automation.
  • Proven experience in technical leadership, team management, and cross-functional collaboration.
  • Deep expertise in AIOps platforms: Moogsoft, BigPanda, Splunk ITSI, ServiceNow ITOM, or custom ML-based solutions.
  • Strong proficiency in Python, with working knowledge of SQL, PowerShell, and Bash.
  • Hands-on experience with Azure, AWS, or GCP, including monitoring and automation services.
  • Skilled in CI/CD tools (Azure DevOps, GitHub Actions, Jenkins), IaC (Terraform, Ansible), and Kubernetes.
  • Familiarity with observability stacks (ELK, OpenTelemetry, Kafka) and data engineering workflows.
  • Excellent communication, stakeholder management, and problem-solving skills.

Benefits

  • Competitive salary
  • Health insurance
  • Professional development opportunities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
AIOps, Python, SQL, PowerShell, Bash, CI/CD, IaC, Kubernetes, ML models, data pipelines
Soft skills
technical leadership, team management, cross-functional collaboration, communication, stakeholder management, problem-solving, innovation, continuous learning, operational excellence, cost optimization