
Software Engineer – DevOps AI Rater/Evaluator
LILT AI
Contract
Location Type: Remote
Location: Germany
About the role
- Evaluate AI outputs related to software engineering, DevOps, and infrastructure topics
- Perform structured scoring, comparison, classification, and judgment tasks
- Assess technical correctness, completeness, security implications, and best-practice alignment
- Identify hallucinations, incorrect code, unsafe recommendations, or misleading system guidance
- Apply domain-specific engineering and DevOps guidelines consistently across tasks
- Validate and refine evaluation rubrics and edge-case handling
- Perform adjudication where raters disagree
- Conduct error analysis and qualitative reviews of model behavior
- Partner with LILT research, product, and customer teams on evaluation design
- Support red-teaming, security review, and model readiness assessments
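The structured scoring and comparison tasks above can be sketched as a minimal weighted rubric. This is an illustrative sketch only: the criterion names, the 0–5 scale, and the weights are hypothetical assumptions, not LILT's actual evaluation rubric.

```python
from dataclasses import dataclass

# Hypothetical rubric for rating an AI-generated DevOps answer.
# Criterion names and weights are illustrative, not LILT's actual rubric.
@dataclass
class Criterion:
    name: str
    weight: float  # relative importance; weights sum to 1.0

RUBRIC = [
    Criterion("technical_correctness", 0.40),
    Criterion("completeness", 0.20),
    Criterion("security", 0.25),
    Criterion("best_practices", 0.15),
]

def score(ratings: dict[str, int]) -> float:
    """Combine per-criterion ratings (0-5 scale) into one weighted score."""
    return sum(c.weight * ratings[c.name] for c in RUBRIC)

# Example: an answer that is correct but weak on security hardening.
ratings = {
    "technical_correctness": 5,
    "completeness": 4,
    "security": 2,
    "best_practices": 4,
}
print(round(score(ratings), 2))  # → 3.9
```

Keeping criteria and weights explicit like this makes adjudication between disagreeing raters traceable: a score difference can be decomposed into per-criterion differences.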
Requirements
- Background as a software engineer, site reliability engineer, DevOps engineer, or platform engineer
- Experience with production systems, CI/CD pipelines, cloud infrastructure, or distributed systems
- Strong attention to detail and comfort working with structured evaluation criteria
- Native or professional fluency in one or more supported languages is required
- English fluency is required for guidelines, feedback, and collaboration
Benefits
- Contract-based, flexible participation
- Project-based work with clear expectations and timelines
- Opportunities for recurring work based on performance and demand
- Compensation communicated upfront per project or task type
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
AI evaluation, software engineering, DevOps, infrastructure, error analysis, CI/CD pipelines, cloud infrastructure, distributed systems, technical correctness, best-practice alignment
Soft Skills
attention to detail, structured evaluation, collaboration, judgment, qualitative reviews