AI Engineer, Evaluation & Quality

Coupa Software

Full-time

Posted on: 3/19/2026

Location Type: Remote

Location: India

Job Level

Mid-Level, Senior

Tech Stack

Python

About the role

Build and maintain automated evaluation pipelines for AI model quality.
Implement task-specific benchmarks and test suites.
Design quality dashboards tracking accuracy, regression, and safety metrics.
Implement automated regression testing for every model iteration.
Build comparison frameworks for side-by-side evaluation of model variants.
Analyze evaluation results to identify failure modes and report to the ML team.
Maintain evaluation datasets: versioning, quality validation, coverage analysis.
Support A/B testing infrastructure for production model validation.

Requirements

3+ years of software engineering or quality engineering experience.
Proficiency in Python with strong testing and automation skills.
Experience with statistical analysis and data visualization.
Understanding of ML model evaluation concepts (precision, recall, F1, human eval).
Experience building automated test frameworks and CI/CD pipelines.
Familiarity with dashboarding tools.
Strong analytical and problem-solving skills.
BS in Computer Science, Statistics, or equivalent experience.

Benefits

Pioneering Technology
Collaborative Culture
Global Impact

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Python, automated evaluation pipelines, statistical analysis, data visualization, automated test frameworks, CI/CD pipelines, model evaluation concepts, accuracy metrics, regression testing, A/B testing

Soft Skills

analytical skills, problem-solving skills

Certifications

BS in Computer Science, BS in Statistics