Arabic (Levantine) AI Evaluation Specialist

Welocalize

Full-time

Posted on: 1/15/2026

Location Type: Remote

Location: Egypt

Salary

💰 $10 per hour

Job Level

Mid-Level, Senior

About the role

Design scenario-based and edge-case prompts to test AI behavior, including trick and incomplete-information cases.
Develop evaluation rubrics to assess AI responses across instruction-following, factuality, tone, safety, refusals, and helpfulness.
Perform side-by-side evaluations of AI outputs and score them on a 1–5 scale using defined criteria.
Create high-quality source documents (articles, transcripts, reports) as the single source of truth for testing.
Write accurate and well-structured Golden Responses that correctly follow instructions and handle ambiguity.

Requirements

Bachelor's degree or equivalent experience in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field.
English proficiency at B2 level or higher.
Native fluency in Modern Standard Arabic and the Levantine dialect.
Strong understanding of the distinction between Fusha (formal Arabic) and ‘Ammiyya (colloquial Arabic).
Proven experience in a role involving AI data annotation, content quality review, search quality rating, or prompt engineering.
Ability to work independently and manage workflows effectively in a remote environment.
Proficiency in one or more additional Arabic dialects is a plus.
Strong attention to detail and critical thinking to identify hallucinations and bias.
Familiarity with data annotation platforms and model evaluation tools is a plus.
Cultural familiarity with regional norms and high-context communication styles, particularly in the GCC region.

Benefits

Limitless Flexibility
Limitless Growth
Limitless Support
Real Impact

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills

AI data annotation, content quality review, search quality rating, prompt engineering, evaluation rubrics, Golden Responses, scenario-based testing, edge-case testing, critical thinking, attention to detail

Soft skills

independent work, workflow management, communication, cultural familiarity, high-context communication