
Senior Software Engineer, AI
Lattice
Full-time
Location Type: Remote
Location: Canada
Salary
💰 CA$123,750 - CA$165,000 per year
About the role
- Design and ship a robust, end-to-end AI evaluation framework, covering offline evals, production tracing, and human-in-the-loop feedback loops, connected across all of Lattice’s AI use cases.
- Define and instrument the metrics that actually matter: agent task completion, hallucination rates, response quality, user engagement, and downstream business outcomes.
- Build and maintain evaluation datasets, test harnesses, and automated scoring pipelines to catch regressions before they ship.
- Identify and surface the drivers of agent quality improvement, giving the team clear signals on where to invest.
- Architect and implement reusable agent infrastructure: multi-turn conversation workflows, recommendation services, LLM DAGs, and standardized agent topology patterns using LangGraph.
- Build and scale RAG pipelines and retrieval infrastructure, including vector store management and retrieval quality optimization.
- Make principled build vs. buy decisions across LLM providers, agent frameworks, and evaluation tooling, balancing capability, cost, latency, and vendor risk.
- Contribute to production AI systems with a strong focus on reliability, observability, and performance, not just prototypes.
- Own projects end-to-end: scope them, drive them to completion, and bring in the right people at the right time.
- Partner with engineering leads and managers to inform technical direction on agent quality and evaluation strategy; you’ll be expected to hold intelligent, substantive conversations about methodology, not just implementation.
- Raise the AI engineering bar across the broader team through code review, documentation, and thoughtful technical debate.
Requirements
- 5+ years of professional software engineering experience with significant time spent on production AI/ML systems.
- Deep hands-on experience with LLM-based systems: prompt engineering, RAG pipelines, agent orchestration, evaluation metrics, and model fine-tuning.
- Proven ability to work with data and understand statistics, especially in experiments.
- Proven ability to build and operate agentic AI systems in production: multi-step workflows, multi-agent topologies, and the failure modes that come with them.
- Strong command of AI evaluation: you’ve built eval frameworks before, you know the difference between a good eval and a vanity metric, and you have opinions about it.
- Production-grade Python engineering: clean, maintainable, testable code.
- LangGraph or a comparable agent orchestration framework. You’ve built real agent workflows with it, not just followed tutorials.
- LangSmith or comparable LLM observability tooling for tracing, evaluation, and debugging.
- Regularly reads AI papers and blogs; a trusted source on AI trends.
- Vector databases (Pinecone or similar) and retrieval system design.
- AWS ecosystem or other cloud infrastructure (e.g., GCP). Comfortable with Lambda functions, queues, and cloud-native architecture.
- Familiarity with TypeScript is a plus.
Benefits
- Medical insurance
- Dental insurance
- Life, AD&D, and Disability Insurance
- Natural Disaster Support Program
- Wellness Apps
- Paid Parental Leave
- Paid Time Off, inclusive of holidays and sick time
- Working Remotely Stipend
- One-Time WFH Office Setup Stipend
- Retirement Plan
- Financial Planning
- Learning & Development Budget
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
AI evaluation framework, production AI/ML systems, LLM-based systems, prompt engineering, RAG pipelines, agent orchestration, evaluation metrics, model fine-tuning, production-grade Python engineering, multi-step workflows
Soft Skills
project ownership, technical direction, collaboration, communication, critical thinking, problem-solving, code review, documentation, technical debate, data-driven decision making