Python Developer, AI Evaluation Frameworks

Michelin

Full-time

Posted on: 4/7/2026

Location Type: Hybrid

Location: Pune • India

Job Level

Mid-Level, Senior

Tech Stack

Azure, Cloud, Distributed Systems, Django, Flask, Python

About the role

Design and implement AI evaluation frameworks and tooling for model assessment, benchmarking, and automated testing of LLMs, agents, and GenAI features.
Build production‑grade Python applications and APIs to support evaluation pipelines and integrations.
Collaborate with the QA team to brainstorm current evaluation challenges and build reproducible evaluation workflows.
Implement end‑to‑end evaluation pipelines including data preprocessing, metric computation, test orchestration, and reporting.
Ensure code quality and maintain coding standards through static analysis, unit/integration tests, code reviews, and tooling (e.g., SonarQube).
Contribute to the design and implementation of APIs and services.
Deploy and operate evaluation components on Azure, leveraging platform services and following infrastructure‑as‑code practices.
Instrument monitoring, logging, and alerting for evaluation pipelines; capture audit trails and results for compliance and reproducibility.
Partner with data scientists, ML engineers, and product stakeholders to gather requirements, validate evaluation approaches, and incorporate feedback.
Support peers in troubleshooting and resolving issues across development and QA; mentor junior developers and share best practices.
Maintain documentation for evaluation frameworks, runbooks, and related tooling.
Build, execute, optimize, and monitor unit tests and unit test plans to ensure quality, security, and consistency.
Detect, analyze, report, and resolve malfunctions, incidents, and bugs.

Requirements

5–7 years of professional Python development experience with strong, demonstrable hands‑on skills.
Solid understanding of OOP concepts, software design principles, and coding best practices.
Experience with test‑driven development, writing unit and integration tests, and collaborating with QA teams on automated testing.
Familiarity with the full project lifecycle: requirements, design, development, code review, deployment, maintenance, and deprecation.
Experience building RESTful APIs using FastAPI, Flask, or Django.
Practical experience with Azure cloud services and deployment patterns (App Services, AKS, Azure Functions, Blob/Storage, DevOps pipelines).
Exposure to CI/CD tooling and code quality tools such as SonarQube.
Working knowledge of AI/DS concepts—particularly GenAI, LLMs, RAG patterns, and agent architectures.
Strong problem-solving and debugging skills, and the ability to work across distributed systems.
Excellent communication skills and demonstrated ability to work closely with QA, data science, and product teams.

Benefits

Health insurance
Flexible work hours
Professional development opportunities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Python, OOP, test-driven development, unit testing, integration testing, RESTful APIs, FastAPI, Flask, Django, AI/DS concepts

Soft Skills

problem solving, debugging, communication, collaboration, mentoring, troubleshooting, analytical skills, attention to detail, adaptability, teamwork