Senior Software Engineer, AI

Lattice

Senior Software Engineer on the AI Engineering team at Lattice. Designing and implementing systems for AI performance evaluation and quality improvement.

Posted 5/27/2026 • Full-time • Remote • United States • Senior • $160,000 - $185,000 per year

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills

AI evaluation framework, RAG pipelines, LLM-based systems, prompt engineering, agent orchestration, evaluation metrics, model fine-tuning, production-grade Python, multi-step workflows, vector databases

Soft Skills

project ownership, technical direction, collaboration, communication, critical thinking, code review, documentation, technical debate, problem-solving, data-driven decision making

Tools & Technologies

LangGraph, LangSmith, AWS, GCP, cloud-native architecture, Pinecone, automated scoring pipelines, test harnesses, observability tooling, vector store management

Industry Keywords

AI/ML systems, agentic AI systems, evaluation frameworks, statistics, experiments, production systems, user engagement, hallucination rates, response quality, business outcomes

Tech Stack

Tools & technologies

AWS, Cloud, Google Cloud Platform, Python, TypeScript

About the role

Key responsibilities & impact

Design and ship a robust, end-to-end AI evaluation framework, covering offline evals, production tracing, and human-in-the-loop feedback loops, connected across all of Lattice’s AI use cases.
Define and instrument the metrics that actually matter: agent task completion, hallucination rates, response quality, user engagement, and downstream business outcomes.
Build and maintain evaluation datasets, test harnesses, and automated scoring pipelines to catch regressions before they ship.
Identify and surface the drivers of agent quality improvement, giving the team clear signals on where to invest.
Architect and implement reusable agent infrastructure: multi-turn conversation workflows, recommendation services, LLM DAGs, and standardized agent topology patterns using LangGraph.
Build and scale RAG pipelines and retrieval infrastructure, including vector store management and retrieval quality optimization.
Make principled build vs. buy decisions across LLM providers, agent frameworks, and evaluation tooling, balancing capability, cost, latency, and vendor risk.
Contribute to production AI systems with a strong focus on reliability, observability, and performance, not just prototypes.
Own projects end-to-end: scope them, drive them to completion, and bring in the right people at the right time.
Partner with engineering leads and managers to inform technical direction on agent quality and evaluation strategy. You’ll be expected to hold intelligent, substantive conversations about methodology, not just implementation.
Raise the AI engineering bar across the broader team through code review, documentation, and thoughtful technical debate.

Requirements

What you’ll need

5+ years of professional software engineering experience with significant time spent on production AI/ML systems.
Deep hands-on experience with LLM-based systems: prompt engineering, RAG pipelines, agent orchestration, evaluation metrics, and model fine-tuning.
Proven ability to work with data and understand statistics, especially in experiments.
Proven ability to build and operate agentic AI systems in production: multi-step workflows, multi-agent topologies, and the failure modes that come with them.
Strong command of AI evaluation: you’ve built eval frameworks before, you know the difference between a good eval and a vanity metric, and you have opinions about it.
Production-grade Python engineering: clean, maintainable, testable code.
LangGraph or comparable agent orchestration frameworks.
LangSmith or comparable LLM observability tooling for tracing, evaluation, and debugging.
Regularly reads AI papers and blogs, and is a trusted source on AI trends.
Vector databases (Pinecone or similar) and retrieval system design.
AWS ecosystem or other cloud infrastructure (e.g., GCP). Comfortable with Lambda functions, queues, and cloud-native architecture.
Familiarity with TypeScript is a plus.

Benefits

Comp & perks

Medical insurance
Dental insurance
Vision insurance
Life, AD&D, and Disability Insurance
Emergency Weather Support
Wellness Apps
Paid Parental Leave
Paid Time Off, inclusive of holidays and sick time
Commuter & Parking Accounts
Lunches in the Office
Internet and Phone Stipend
401(k) retirement plan
Financial Planning
Learning & Development Budget