Develop a QA framework tailored to GenAI systems and workflows.
Design tests for: Prompt behavior across varied inputs and user tasks; Hallucination detection; Factual consistency and groundedness.
Blend manual and automated test design for both deterministic and stochastic outputs.
Collaborate with teams to obtain or create sample data with clear target outputs.
Own end-to-end test strategy and execution for GenAI-powered features.
Ensure coverage across: Diverse prompt phrasing, user intents, and failure modes; Multiple GenAI features (e.g., summarization, generation, classification); High-risk, edge-case, and compliance-driven scenarios.
Lead the design and implementation of prompt and model evaluation protocols covering: Alignment between user input and intended behavior; Output fluency, tone, and coherence; Clarity, coverage, and relevance of responses.
Use golden datasets and benchmark prompts to establish evaluation baselines.
Design and manage SME-driven review workflows.
Facilitate structured reviews focused on: Correctness/accuracy based on metrics and SME feedback; Capturing edge-case failures.
Define and track QA effectiveness using metrics such as: Pass rate for high-risk use cases; HITL reviewer agreement rates and rates of flagged critical issues; Use-case-specific measures of “quality”.
Deliver clear, actionable dashboards and reports to leadership on AI quality, safety, and readiness.
Requirements
You are excited by the complexities and challenges of GenAI testing.
You think like a product owner, act like a tester, and communicate like a coach.
You thrive in ambiguity and enjoy shaping new standards.
You are passionate about safe, responsible AI development.
Benefits
100% Remote Work: Enjoy the freedom to work from the location that helps you thrive. All it takes is a laptop and a reliable internet connection.
Highly Competitive USD Pay: Earn excellent, market-leading compensation in USD that goes beyond typical market offerings.
Paid Time Off: We value your well-being. Our paid time off policies ensure you have the chance to unwind and recharge when needed.
Work with Autonomy: Enjoy the freedom to manage your time as long as the work gets done. Focus on results, not the clock.
Work with Top American Companies: Grow your expertise working on innovative, high-impact projects with industry-leading U.S. companies.