Tech Stack
Tools & technologies
- AWS
- Azure
- Cloud
- Docker
- Google Cloud Platform
- Jenkins
- Python
- SQL
About the role
Key responsibilities & impact
- Design and build evaluation harnesses for agentic systems in Python
- Author automated test suites for prompts and workflows
- Validate guardrails around tool execution
- Wire evaluations into CI using Dataiku Evaluations, GitHub Actions or Jenkins
- Build observability into testing
- Own quality end-to-end
- Partner with data engineers on retrieval testing patterns
- Help shape internal QA standards for AI & Data engineering
- Participate in a collaborative DevOps environment
Requirements
What you’ll need
- 6+ years of experience in QA / SDET
- 3+ years of professional experience automating tests for backend services or data pipelines
- 1+ years of hands-on experience testing LLM or AI features in production
- Working knowledge of evaluation frameworks such as RAGAS, DeepEval, LangSmith or comparable LLM-as-judge tooling
- Strong Python and PyTest skills
- Solid SQL skills
- Familiarity with at least one cloud platform (AWS, Azure or GCP)
- Fluency with Git, Docker, REST APIs and at least one CI tool
- Solid understanding of data security and responsible AI practices
- Proven ability to work independently and within a team
- Strong written and verbal communication skills
- A bachelor’s degree is not required; equivalent practical experience counts
Benefits
Comp & perks
- Comprehensive health coverage
- Well-being perks supporting teammates and their dependents
- Flexible time off
- Recognition of public holidays
- Opportunities for charitable partnerships and volunteer work
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- Python
- PyTest
- SQL
- automated testing
- evaluation frameworks
- RAGAS
- DeepEval
- LangSmith
- CI tools
- data security
Soft Skills
- independent work
- team collaboration
- written communication
- verbal communication
