
Staff Machine Learning Engineer
Wand AI
Full-time
Location Type: Remote
Location: United States
About the role
- Architect and lead the development of scalable ML platforms that support autonomous, goal-driven AI agents.
- Design systems that support the full ML lifecycle, including agentic decision-making, task orchestration, and automated goal execution.
- Build frameworks for integrating models with product logic, business objectives, and operational workflows.
- Lead the development of pipelines that enable experimentation, productionization, and continuous agentic learning.
- Define architecture standards and engineering practices for agentic AI, goal alignment, and productized ML solutions.
- Collaborate with data science and product teams to turn research outputs into production AI agents that drive real product impact.
- Design infrastructure supporting large-scale training, inference, and multi-agent coordination workloads.
- Strengthen observability and monitoring across pipelines, AI agents, and goal-driven behavior execution.
- Implement systems for automated evaluation, goal alignment checks, drift detection, and retraining.
- Improve reliability, scalability, and operational excellence of ML services powering autonomous workflows.
- Lead troubleshooting of complex agentic system failures and distributed ML infrastructure issues.
- Influence CI/CD and development workflows that support the ML lifecycle, agent orchestration, and automated deployment.
- Mentor engineers to build expertise in agentic systems, AI-driven product logic, and autonomous workflows.
- Collaborate with architects and senior engineers to shape long-term AI platform strategy and agentic product roadmaps.
Requirements
- Extensive hands-on experience building production ML systems integrated with product goals and business logic.
- Deep expertise in agentic AI, ML engineering, and MLOps practices.
- Strong programming skills in Python and experience integrating ML with backend systems and autonomous workflows.
- Proven experience deploying machine learning models at scale, including goal-driven or multi-agent systems.
- Experience building ML infrastructure for training, experimentation, inference, and agent coordination.
- Strong understanding of distributed systems, scalable data pipelines, and real-time agentic decision loops.
- Experience designing ML systems on cloud platforms such as AWS, Azure, or GCP.
- Experience building highly available model serving systems supporting autonomous agentic tasks.
- Ability to influence architecture and product integration decisions across engineering teams.
- Strong debugging and troubleshooting skills in complex production ML and agentic AI environments.
- Ability to lead complex technical initiatives without formal management authority.
- Excellent communication skills for working effectively across engineering, product, and data science teams.
Benefits
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning, MLOps, Python, agentic AI, distributed systems, scalable data pipelines, model serving, cloud platforms, infrastructure design, automated evaluation
Soft Skills
leadership, communication, troubleshooting, mentoring, collaboration, influence, problem-solving, organizational skills, technical initiative leadership, cross-functional teamwork