
Senior Lead AI Engineer, Data
Coupa Software
Full-time
Location Type: Remote
Location: India
About the role
- Lead the design and implementation of data pipelines that prepare high-quality training data for AI models.
- Build data curation workflows that transform raw enterprise data into labeled, validated datasets.
- Design data quality frameworks: validation, profiling, anomaly detection, lineage tracking.
- Extend existing anonymized data export pipelines to support AI training workloads.
- Implement synthetic data generation pipelines.
- Design schema mappings across 197+ enterprise tables for feature extraction.
- Collaborate with ML engineers on training data format requirements.
- Establish data catalog and metadata management for AI training artifacts.
Requirements
- 10+ years of software engineering experience, with 5+ years in data engineering.
- Strong experience with Apache Spark / PySpark and large-scale data processing.
- Experience building ETL/ELT pipelines on cloud infrastructure (managed Spark, object storage, managed ETL, or equivalent).
- Knowledge of data quality frameworks and data governance.
- Experience with data anonymization and privacy-preserving data processing.
- Understanding of ML training data requirements.
- Proficiency in Python and SQL.
- Experience with data catalog tools and metadata management.
- BS/MS in Computer Science or equivalent experience.
- Experience in B2B SaaS with multi-tenant data preferred.
Benefits
- Pioneering Technology: At Coupa, we're at the forefront of innovation, leveraging the latest technology to empower our customers with greater efficiency and visibility in their spend.
- Collaborative Culture: We value collaboration and teamwork, and our culture is driven by transparency, openness, and a shared commitment to excellence.
- Global Impact: Join a company where your work has a global, measurable impact on our clients, the business, and each other.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
data pipelines, data curation workflows, data quality frameworks, anomaly detection, synthetic data generation, schema mappings, Apache Spark, PySpark, Python, SQL
Certifications
BS in Computer Science, MS in Computer Science