LILT AI

Software Engineering, DevOps AI Rater/Evaluator

Employment Type: Contract

Location Type: Remote

Location: United Arab Emirates

About the role

  • Evaluate AI outputs related to software engineering, DevOps, and infrastructure topics
  • Perform structured scoring, comparison, classification, and judgment tasks
  • Assess technical correctness, completeness, security implications, and best-practice alignment
  • Identify hallucinations, incorrect code, unsafe recommendations, or misleading system guidance
  • Apply domain-specific engineering and DevOps guidelines consistently across tasks
  • Validate and refine evaluation rubrics and edge-case handling
  • Perform adjudication where raters disagree
  • Conduct error analysis and qualitative reviews of model behavior
  • Partner with LILT research, product, and customer teams on evaluation design
  • Support red-teaming, security review, and model readiness assessments

Requirements

  • Background as a software engineer, site reliability engineer, DevOps engineer, or platform engineer
  • Experience with production systems, CI/CD pipelines, cloud infrastructure, or distributed systems
  • Strong attention to detail and comfort working with structured evaluation criteria
  • Native or professional fluency in one or more supported languages is required
  • English fluency is required for guidelines, feedback, and collaboration

Benefits

  • Contract-based, flexible participation
  • Project-based work with clear expectations and timelines
  • Opportunities for recurring work based on performance and demand
  • Compensation communicated upfront per project or task type

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
AI evaluation, software engineering, DevOps, infrastructure, error analysis, CI/CD pipelines, cloud infrastructure, distributed systems, technical correctness, best-practice alignment
Soft Skills
attention to detail, structured evaluation, collaboration, judgment, qualitative reviews