Lead Machine Learning Engineer – MLOps, KServe, Kubernetes, PyTorch, TensorFlow, AWS
Capital One
Lead Machine Learning Engineer at Capital One, responsible for productionizing ML applications and collaborating with cross-functional teams to drive AI innovation. The role involves the design, development, and implementation of scalable solutions across the company.
Posted 5/19/2026
full-time
McLean • California, Massachusetts, New York, Virginia • 🇺🇸 United States
Senior
💰 $179,400 - $245,600 per year
Tech Stack
Tools & technologies
AWS, Azure, Cloud, Google Cloud Platform, Java, Python, PyTorch, Scala, Scikit-Learn, Spark, TensorFlow
About the role
Key responsibilities & impact
- Participate in the detailed technical design, development, and implementation of machine learning applications using existing and emerging technology platforms.
- Focus on machine learning architectural design, develop and review model and application code, and ensure high availability and performance of our machine learning applications.
- Continuously learn and apply the latest innovations and best practices in machine learning engineering.
- Design, build, and/or deliver ML models and components that solve real-world business problems, while working in collaboration with the Product and Data Science teams.
- Inform ML infrastructure decisions using understanding of ML modeling techniques and issues, including choice of model, data, and feature selection, model training, hyperparameter tuning, dimensionality, bias/variance, and validation.
- Solve complex problems by writing and testing application code, developing and validating ML models, and automating tests and deployment.
- Collaborate as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art big data and ML applications.
- Retrain, maintain, and monitor models in production.
- Leverage or build cloud-based architectures, technologies, and/or platforms to deliver optimized ML models at scale.
- Construct optimized data pipelines to feed ML models.
- Leverage continuous integration and continuous deployment best practices, including test automation and monitoring, to ensure successful deployment of ML models and application code.
- Ensure all code is well-managed to reduce vulnerabilities, models are well-governed from a risk perspective, and the ML follows best practices in Responsible and Explainable AI.
Requirements
What you’ll need
- Bachelor’s degree
- At least 6 years of experience designing and building data-intensive solutions using distributed computing (Internship experience does not apply)
- At least 4 years of experience programming with Python, Scala, or Java
- At least 2 years of experience building, scaling, and optimizing ML systems
- Master's or doctoral degree in computer science, electrical engineering, mathematics, or a similar field (Preferred)
- 3+ years of experience building production-ready data pipelines that feed ML models (Preferred)
- 3+ years of on-the-job experience with an industry recognized ML framework such as scikit-learn, PyTorch, Dask, Spark, or TensorFlow (Preferred)
- 2+ years of experience developing performant, resilient, and maintainable code (Preferred)
- 2+ years of experience with data gathering and preparation for ML models (Preferred)
- 2+ years of people leader experience (Preferred)
- 1+ years of experience leading teams developing ML solutions using industry best practices, patterns, and automation (Preferred)
- Experience developing and deploying ML solutions in a public cloud such as AWS, Azure, or Google Cloud Platform
Benefits
Comp & perks
- Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
- Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
machine learning, Python, Scala, Java, ML frameworks, scikit-learn, PyTorch, Dask, Spark, TensorFlow
Soft Skills
collaboration, problem-solving, leadership, communication, Agile methodology
Certifications
Bachelor's degree, Master's degree, Doctoral degree