
Director, Software Engineering, Information Security
Walmart
Full-time
Location Type: Hybrid
Location: Bentonville • Virginia • United States
Salary
💰 $130,000 - $260,000 per year
About the role
- Lead the transformation of the Operational Excellence function into a world-class Information Security Reliability Engineering practice.
- Lead the implementation of InfoSec’s strategic roadmap for HA Ops, using modern tooling to reduce alert fatigue and predict system anomalies.
- Collaborate with Fellows and Engineers to adopt and implement HA observability standards.
- Direct your team to build and maintain complex ServiceNow workflows and catalog items.
- Automate manual request fulfillment (e.g., access grants, firewall changes) to reduce toil and improve delivery speed.
- Oversee the management of standardized change processes, utilizing intelligent automation to streamline approvals and compliance checks.
- Enhance change management reporting, delivering comprehensive analytics on stability and velocity.
- Use GTP tooling to establish a robust framework for incident and problem management, including AI-assisted post-incident debriefs and automated root-cause analysis drafting.
- Orchestrate InfoSec’s decisive response to P-Level incidents, minimizing Mean Time to Mitigate (MTTM).
- Collaborate with Engineers and Fellows to establish and enforce standards for High Availability (HA) and reliability across all InfoSec applications.
- Develop real-time application health dashboards and integrated performance monitoring.
- Champion "Self-Healing" infrastructure initiatives, using automation to detect and recover from system anomalies.
- Support InfoSec teams in meeting operational excellence targets and maintaining availability per system SLO.
- Coordinate with Global Tech and InfoSec engineering teams to ensure alignment on reliability goals.
- Partner with service owners to ensure required observability and AI-based monitoring tools are correctly implemented.
- Provide leadership and mentorship to the engineering on-call rotation, fostering a culture of blameless retrospectives.
Requirements
- Option 1: Bachelor's degree in computer science, information technology, engineering, information systems, cybersecurity, or related area and 6 years’ experience in software engineering or related area at a technology, retail, or data-driven company.
- Option 2: 8 years’ experience in software engineering or related area at a technology, retail, or data-driven company. 3 years’ supervisory experience.
- 10+ years of experience in Reliability Engineering, DevOps, or Information Security Operations, with at least 5 years in a leadership role.
- Proven experience implementing observability and APM platforms (e.g., Dynatrace, Datadog Watchdog) and integrating LLM/Generative AI tools to accelerate engineering workflows.
- Extensive experience managing P1/P0 critical incidents and leading "war rooms" under pressure.
- Strong understanding of Cloud Infrastructure (Azure/GCP), Infrastructure as Code (Terraform/Ansible), and modern observability stacks.
- Ability to distinguish between vanity metrics and actionable data, using analytics to drive continuous improvement in system stability.
Benefits
- Health benefits include medical, vision and dental coverage.
- Financial benefits include 401(k), stock purchase and company-paid life insurance.
- Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting.
- Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more.
- Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Reliability Engineering, DevOps, Information Security Operations, Observability, Application Performance Monitoring (APM), Cloud Infrastructure, Infrastructure as Code, Automation, Analytics, Incident Management
Soft Skills
Leadership, Mentorship, Collaboration, Problem-solving, Communication, Crisis Management, Continuous Improvement, Blameless Retrospectives, Team Coordination, Strategic Planning