Lead System Engineer – AI Automation, SRE Focus
AT&T
Lead AI Automation Engineer with a focus on SRE and AI-driven automation at AT&T. Design, implement, and operate intelligent reliability capabilities across critical platforms.
Posted 5/12/2026 | full-time | Plano • Texas, Washington • 🇺🇸 United States | Senior | 💰 $158,200 - $237,400 per year
Tech Stack
Tools & technologies
Azure, Cloud, Docker, ERP, Grafana, Kafka, Kubernetes, Oracle, Python, Shell Scripting, Splunk, SQL
About the role
Key responsibilities & impact
- Architect and deliver AI-powered automation solutions for production operations, including intelligent incident triage, root cause analysis, remediation, and prevention
- Design Agentic AI workflows that autonomously monitor systems, analyze anomalies, trigger corrective actions, and orchestrate recovery across ERP, supply chain, and integration layers
- Apply AIOps techniques to correlate metrics, logs, events, and traces for predictive alerting, noise reduction, and proactive reliability improvements
- Develop LLM-enabled runbooks and intelligent assistants to guide operational decision-making, accelerate incident response, and upskill operations teams
- Own platform stability, uptime, and performance across Oracle EBS/ERP, Oracle Fusion Cloud, and supply chain execution systems
- Lead incident management, coordinating rapid response, containing impact, and ensuring SLA adherence
- Conduct blameless postmortems, using AI-assisted RCA to identify systemic issues and drive automation-first corrective actions
- Provide advanced production support for Oracle EBS/ERP modules including Procurement, Order Management, Inventory, AR, AP, FA, Project Accounting, and Supply Chain Planning
- Troubleshoot complex issues across configuration, master data, transactions, batch jobs, interfaces, and integrations, leveraging deep SQL and system-level analysis
- Monitor and support third-party platforms (o9, Blue Yonder/JDA, RELEX) and integrations with WMS, 3PL, and logistics providers
- Build and evolve AI-augmented observability solutions using tools such as Dynatrace, AppDynamics, Splunk, ELK, Grafana, and custom ML models
- Implement predictive health monitoring, capacity forecasting, and intelligent service-level indicators (SLIs/SLOs)
- Collaborate with middleware, cloud, and vendor teams to resolve cross-system defects, data mismatches, latency issues, and sequencing problems
- Support release management, ensuring changes meet reliability, security, and performance standards
- Participate in on-call rotations, with a strong emphasis on automation and AI-driven reduction of recurring incidents
Requirements
What you’ll need
- 10+ years of experience across enterprise application engineering, SRE, and production operations, with an automation-first mindset
- Proven experience driving AI-based automation, AIOps, or intelligent operational tooling in complex enterprise environments
- Strong ownership mentality for system reliability, performance, and customer impact
- Hands-on experience with Generative AI, LLMs, or Agentic AI frameworks applied to automation, monitoring, or operations
- Proficiency in Python, Shell scripting, SQL and PL/SQL, and automation frameworks
- Experience building AI-enhanced runbooks, chatbots, or autonomous operational workflows
- Ability to translate operational patterns into repeatable, intelligent automation
- Deep experience with Oracle EBS and/or Oracle Fusion Cloud (AR, AP, FA, PO, INV, OM, PA, Planning)
- Strong knowledge of observability platforms: Dynatrace, AppDynamics, Splunk, ELK, Grafana
- Experience with integration technologies: Oracle SOA/OIC, MuleSoft, Kafka/JMS, EDI
- Familiarity with containers and cloud platforms (Docker, Kubernetes, Azure)
- Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field
Benefits
Comp & perks
- Medical/Dental/Vision coverage
- 401(k) plan
- Tuition reimbursement program
- Paid Time Off and Holidays (at least 23 days of vacation each year and 9 company-designated holidays)
- Paid Parental Leave
- Paid Caregiver Leave
- Additional sick leave beyond what state and local law require may be available but is unprotected
- Adoption Reimbursement
- Disability Benefits (short term and long term)
- Life and Accidental Death Insurance
- Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
- Employee Assistance Programs (EAP)
- Extensive employee wellness programs
- Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available) and AT&T phone.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Shell scripting, SQL, PL/SQL, AIOps, Generative AI, LLMs, automation frameworks, integration technologies, observability platforms
Soft Skills
ownership mentality, automation-first mindset, collaboration, incident management, problem-solving, communication, leadership, analytical thinking, decision-making, adaptability
Certifications
Bachelor’s degree in Computer Science, Bachelor’s degree in Engineering, Bachelor’s degree in Information Technology