
Arabic AI Evaluation Specialist
Welocalize
full-time
Location Type: Remote
Location: Egypt
Salary
💰 $10 per hour
About the role
- Conduct side-by-side comparisons of AI responses and rate their quality on a 1–5 scale based on established guidelines.
- Design scenario-based and edge-case prompts to evaluate model behavior, including tricky, ambiguous, or incomplete information situations.
- Assess outputs for instruction adherence, factual accuracy, tone, safety, and overall usefulness.
- Develop clear evaluation rubrics and criteria to ensure consistent scoring across tasks.
- Create reliable reference materials (articles, transcripts, reports, etc.) to serve as the source of truth for testing.
- Write well-structured “gold standard” responses that demonstrate the most accurate and helpful answer.
- Identify potential issues such as hallucinations, inconsistencies, or cultural/contextual mismatches.
Requirements
- Bachelor's degree or equivalent experience in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field.
- English proficiency at B2 level or higher.
- Native fluency in Modern Standard Arabic and the Egyptian dialect.
- Strong understanding of the distinction between Fusha and ‘Ammiyya.
- Proven experience in a role involving AI data annotation, content quality review, search quality rating, or prompt engineering.
- Ability to work independently and manage workflows effectively in a remote environment.
- Nice to have: proficiency in one or more additional Arabic dialects.
- Strong attention to detail and critical thinking to identify hallucinations and bias.
- Familiarity with data annotation platforms and model evaluation tools.
- Experience in prompt engineering, AI evaluation, linguistic QA, or translation is a plus.
- Cultural familiarity with regional norms and high-context communication styles, particularly in the GCC region.
Benefits
- Limitless Flexibility
- Limitless Growth
- Limitless Support
- Real Impact
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
AI data annotation, content quality review, search quality rating, prompt engineering, linguistic QA, translation, evaluation rubrics, critical thinking, factual accuracy assessment, instruction adherence
Soft Skills
attention to detail, independent work, workflow management, cultural familiarity, high-context communication