Capital One

Lead Machine Learning Engineer – MLOps, KServe, Kubernetes, PyTorch, TensorFlow, AWS

Full-time

Location Type: Hybrid

Location: McLean; California; Massachusetts; United States

Salary

💰 $179,400 - $245,600 per year

About the role

  • Participate in the detailed technical design, development, and implementation of machine learning applications using existing and emerging technology platforms
  • Focus on machine learning architectural design, develop and review model and application code, and ensure high availability and performance of machine learning applications
  • Continuously learn and apply the latest innovations and best practices in machine learning engineering
  • Design, build, and/or deliver ML models and components that solve real-world business problems, while working in collaboration with the Product and Data Science teams
  • Inform ML infrastructure decisions using your understanding of ML modeling techniques and their common issues
  • Solve complex problems by writing and testing application code, developing and validating ML models, and automating tests and deployment
  • Collaborate as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art big data and ML applications
  • Retrain, maintain, and monitor models in production
  • Leverage or build cloud-based architectures, technologies, and/or platforms to deliver optimized ML models at scale
  • Construct optimized data pipelines to feed ML models
  • Leverage continuous integration and continuous deployment best practices, including test automation and monitoring, to ensure successful deployment of ML models and application code
  • Ensure that all code is well managed to reduce vulnerabilities, that models are well governed from a risk perspective, and that ML work follows best practices in Responsible and Explainable AI

Requirements

  • Bachelor’s degree
  • At least 6 years of experience designing and building data-intensive solutions using distributed computing (Internship experience does not apply)
  • At least 4 years of experience programming with Python, Scala, or Java
  • At least 2 years of experience building, scaling, and optimizing ML systems
  • Master's or doctoral degree in computer science, electrical engineering, mathematics, or a similar field (Preferred)
  • 3+ years of experience building production-ready data pipelines that feed ML models (Preferred)
  • 3+ years of on-the-job experience with an industry-recognized ML framework such as scikit-learn, PyTorch, Dask, Spark, or TensorFlow (Preferred)
  • 2+ years of experience developing performant, resilient, and maintainable code (Preferred)
  • 2+ years of experience with data gathering and preparation for ML models (Preferred)
  • 2+ years of people-leadership experience (Preferred)
  • 1+ years of experience leading teams developing ML solutions using industry best practices, patterns, and automation (Preferred)
  • Experience developing and deploying ML solutions in a public cloud such as AWS, Azure, or Google Cloud Platform (Preferred)
  • Experience designing, implementing, and scaling complex data pipelines for ML models and evaluating their performance (Preferred)

Benefits

  • Performance-based incentive compensation
  • Cash bonus(es)
  • Long-term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
machine learning, Python, Scala, Java, ML frameworks, scikit-learn, PyTorch, Dask, Spark, TensorFlow
Soft Skills
collaboration, problem-solving, leadership, communication, agile methodology
Certifications
Bachelor's degree, Master's degree, Doctoral degree