Architect, design, and develop the next generation of our AI/ML infrastructure, focusing on capabilities that support agentic frameworks and production workloads.
Develop and execute a comprehensive technical roadmap, defining success metrics and championing the platform's adoption across the organization.
Transition AI/ML models from proof-of-concept to production-ready pipelines using MLOps frameworks, ensuring solutions are scalable, reliable, and maintainable.
Partner with data analytics and data science teams to understand their requirements, iterate on solutions, and provide expert technical guidance.
Stay on the cutting edge of Generative AI and other emerging technologies, contributing to continuous platform improvement and fostering a culture of technical excellence.
Champion best practices for containerization and orchestration technologies to optimize data processing and inference for large-scale models.
Effectively communicate project progress, technical challenges, and solutions to a wide range of audiences, from technical teams to senior leadership.
Requirements
Option 1: Bachelor’s degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology, or related field and 3 years' experience in an analytics-related field.
Option 2: Master’s degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology, or related field and 1 year's experience in an analytics-related field.
Option 3: 5 years' experience in an analytics or related field.
Hands-on experience deploying, managing, and operating AI/ML workloads in a production environment.
Hands-on experience developing, deploying, and maintaining AI agents in production environments.
Deep understanding of ML frameworks and commercial AI/ML infrastructure, such as PyTorch, TensorFlow, Kubeflow/MLflow, and Hugging Face, as well as agentic frameworks such as LangChain, ADK, and AutoGen.
Proven experience with containerization and orchestration technologies, such as Docker and Kubernetes.
Experience with distributed training techniques like data parallelism, tensor parallelism, and pipeline parallelism.
Strong technical acumen with the ability to act as a credible technical advisor, setting and enforcing high-quality standards for code and system design.
Demonstrated ability to develop, drive, and execute a technical vision and roadmap that aligns with business objectives.
Experience building high-traffic distributed systems and ML infrastructure, ensuring scalability and resilience.
Exceptional communication and collaboration skills, with the ability to articulate complex technical concepts clearly to a wide range of audiences.
Experience with the design and development of Generative AI models and technologies is preferred.
Benefits
Health benefits include medical, vision and dental coverage.
Financial benefits include 401(k), stock purchase and company-paid life insurance.
Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting.
Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more.
You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment.
Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities. Programs range from high school completion to bachelor's degrees, including English Language Learning and short-form certificates. Tuition, books, and fees are completely paid for by Walmart. Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
AI infrastructure, ML infrastructure, MLOps frameworks, containerization, orchestration technologies, PyTorch, TensorFlow, KubeFlow, MLFlow, HuggingFace