
Senior Software Engineer, AI Eval
Sentry
full-time
Location Type: Hybrid
Location: San Francisco • California • United States
Salary
💰 $240,000 - $280,000 per year
About the role
- Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
- Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
- Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
- Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
- Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring
Requirements
- At least 5 years of professional experience and a Bachelor’s degree in computer science, machine learning, or a related field
- Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
- Comfort writing production-quality code (we use Python and TypeScript)
- Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
- Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)
- Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools
Benefits
- A successful candidate will be eligible to participate in Sentry’s employee benefit plans/programs applicable to the candidate’s position (including incentive compensation, equity grants, paid time off, and group health insurance coverage).
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, TypeScript, AI systems, machine learning, evaluation frameworks, automated test harnesses, metrics pipelines, data infrastructure, data quality pipelines, regression testing
Soft skills
collaboration, communication, problem-solving, organizational skills, attention to detail
Certifications
Bachelor's degree in computer science, Bachelor's degree in machine learning