
Python Developer, AI Evaluation Frameworks
Michelin
full-time
Location Type: Hybrid
Location: Pune • India
About the role
- Design and implement AI evaluation frameworks and tooling for model assessment, benchmarking, and automated testing of LLMs, agents, and GenAI features.
- Build production‑grade Python applications and APIs to support evaluation pipelines and integrations.
- Collaborate with the QA team to brainstorm current evaluation challenges and build reproducible evaluation workflows.
- Implement end‑to‑end evaluation pipelines including data preprocessing, metric computation, test orchestration, and reporting.
- Ensure code quality and maintain coding standards through static analysis, unit/integration tests, code reviews, and tooling (e.g., SonarQube).
- Contribute to design and implementation of APIs and services.
- Deploy and operate evaluation components on Azure, leveraging platform services and following infrastructure‑as‑code practices.
- Instrument monitoring, logging, and alerting for evaluation pipelines; capture audit trails and results for compliance and reproducibility.
- Partner with data scientists, ML engineers, and product stakeholders to gather requirements, validate evaluation approaches, and incorporate feedback.
- Support peers in troubleshooting and resolving issues across development and QA; mentor junior developers and share best practices.
- Maintain documentation for evaluation frameworks and runbooks.
- Build, execute, optimize, and monitor unit tests and unit test plans to ensure quality, security, and consistency.
- Detect, understand, analyze, report, and resolve malfunctions, incidents, and bugs.
Requirements
- 5–7 years of professional Python development experience with strong, demonstrable hands‑on skills.
- Solid understanding of OOP concepts, software design principles, and coding best practices.
- Experience with test‑driven development, writing unit and integration tests, and collaborating with QA teams on automated testing.
- Familiarity with the full project lifecycle: requirements, design, development, code review, deployment, maintenance, and deprecation.
- Experience building RESTful APIs using FastAPI, Flask, or Django.
- Practical experience with Azure cloud services and deployment patterns (App Services, AKS, Azure Functions, Blob/Storage, DevOps pipelines).
- Exposure to CI/CD tooling and code quality tools such as SonarQube.
- Working knowledge of AI/DS concepts, particularly GenAI, LLMs, RAG patterns, and agent architectures.
- Strong problem‑solving and debugging skills, and the ability to work across distributed systems.
- Excellent communication skills and demonstrated ability to work closely with QA, data science, and product teams.
Benefits
- Health insurance
- Flexible work hours
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, OOP, test-driven development, unit testing, integration testing, RESTful APIs, FastAPI, Flask, Django, AI/DS concepts
Soft Skills
problem solving, debugging, communication, collaboration, mentoring, troubleshooting, analytical skills, attention to detail, adaptability, teamwork