Tech Stack
Tools & technologies: AWS, Azure, Cloud, Google Cloud Platform, Python
About the role
Key responsibilities & impact
- Act as the AI Evaluation & Quality engineering owner within AI Factory / POD-based delivery models
- Partner with AI Engineers, backend teams, product owners, and architects to define AI acceptance criteria and quality gates
- Support end-to-end validation of GenAI application architectures, including API layers, orchestration logic, and backend services
- Evaluate AI readiness across environments (DEV / QA / PROD) and ensure quality consistency across deployments
- Contribute to enterprise AI solution design discussions with a focus on trust and reliability
- Collaborate closely with product, UX, and platform teams to align AI quality with functional and business goals
- Evaluate and benchmark AI outputs across multiple LLMs (model-agnostic), including GPT, Claude, LLaMA, Gemini, and enterprise models
- Validate Retrieval-Augmented Generation (RAG) pipelines end to end, including retrieval accuracy, grounding, chunking strategies, and edge cases
- Implement automated AI evaluation pipelines using DeepEval and RAGAs to assess accuracy and correctness; faithfulness and hallucination risk; relevance and contextual grounding; consistency and reproducibility; and safety, toxicity, and policy compliance
Requirements
What you’ll need
- 5–7+ years of hands-on experience in AI Evaluation Engineering, combining expertise in Quality Engineering and Test Automation for AI and GenAI systems
- Strong proficiency in Python, with deep experience building PyTest-based automation frameworks
- Hands-on experience with DeepEval and RAGAs for LLM and RAG evaluation
- Solid understanding of LLM behavior, prompt engineering concepts, and RAG architectures
- Experience testing web-based AI applications, APIs, and backend services
- Strong analytical skills to reason about non-deterministic and probabilistic AI outputs
- Working knowledge of cloud platforms (Azure / AWS / GCP) and CI/CD pipelines; Azure experience is a must
Benefits
Comp & perks
- Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
- Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
- Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
- Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
AI Evaluation Engineering, Quality Engineering, Test Automation, Python, PyTest, DeepEval, RAGAs, LLM behavior, prompt engineering, RAG architectures
Soft Skills
analytical skills, collaboration, communication
