Staff AI Engineer – AI Transformation

Multiverse

Posted 6/18/2026 • Full-time • London • 🇬🇧 United Kingdom • Lead

Applicant Tracking System Keywords

Hard Skills

multi-agent systems, AI products, context management, model selection, cost engineering, tool use, evaluation frameworks, full-stack delivery, prompt optimisation, automated pipelines

Soft Skills

product thinking, entrepreneurial instinct, communication, mentorship, technical leadership

Tools & Technologies

APIs, MCPs, data pipelines, Claude Code, human-in-the-loop systems

Industry Keywords

agent orchestration, evaluation framework, task decomposition, shared state management, cost/quality trade-offs

About the role

Key responsibilities & impact

Own the architecture of our internal agentic operating system. The team's work spans the full surface of how Multiverse operates, and you own the technical architecture behind it: the agent orchestration, context strategy, tool integrations, evaluation framework, and production operation. Your design decisions shape what is possible for human and AI teams at Multiverse.
Ship production AI agent systems. This is a building role. You write code, review code, and own the quality of what goes to production. You will personally build and deliver significant agent systems. On a squad this size, nobody leads from a whiteboard.
Design multi-agent coordination. Task decomposition across agents, handoff protocols, shared state management, orchestration logic. You know the difference between agents that genuinely coordinate and agents that run sequentially and hope for the best. You design the patterns that make multi-agent systems reliable.
Build the evaluation and quality infrastructure. Automated eval pipelines, human-in-the-loop review systems, regression testing for prompt changes, domain-specific quality metrics. You treat evaluation as a first-class engineering concern and build the systems that make it possible at scale.
Drive cost engineering. Token economics, caching strategies, model routing, prompt optimisation. The cost profile of production AI systems requires active engineering attention, and you build the cost awareness and tooling into the architecture rather than bolting it on later.
Build the integration layer that makes existing Multiverse systems agent-accessible. APIs, MCPs, shared data contracts, and the tooling that connects agents to the platform, content systems, and the tools the company runs on. This means building real working relationships with engineering teams across London and designing interfaces that serve both sides well.
Set the standard. You define patterns for prompt management, retrieval, guardrails, and testing that the wider team, and eventually the whole organisation, adopt, and that in time shape how the companies who learn from Multiverse do this too. You do this through code, documentation, and architectural decisions, not through mandates.
Mentor the team. Code review, architectural guidance, pairing on the hardest problems. You are not a line manager, but your technical leadership directly shapes the growth of the engineers around you.

Requirements

What you’ll need

You have shipped multi-agent systems or complex AI products to real users. You understand the engineering challenges that make agent systems a distinct discipline:
Context management. Designing what enters the context window and what stays out. Retrieval strategies, chunking, conversation memory, summarisation, and the cost/quality trade-offs of each. You have made these decisions in production and seen the consequences.
Model selection and routing. Choosing the right model for each task based on capability, latency, cost, and reliability. Building routing logic that matches work to the appropriate model rather than defaulting to one.
Cost engineering. Token economics, caching, prompt optimisation, batching. You know the difference between a prototype that works and a production system that works at sustainable cost. You have built systems where cost was an engineering constraint, not someone else's problem.
Tool use and agent augmentation. Designing what capabilities agents can reach: tool descriptions that models use reliably, failure handling, MCPs or equivalent interfaces. You understand that the quality of the tool layer determines whether agents are useful or fragile.
Multi-agent coordination. Task decomposition across agents, handoff protocols, shared state, orchestration logic. You have built systems where multiple agents work together within a product domain and understand the architectural patterns that make coordination reliable.
Evaluation and quality. Building eval frameworks for AI output: accuracy, helpfulness, safety, domain-specific criteria. Automated pipelines and human-in-the-loop review. You would not ship an agent system without a quality baseline.
Product Thinking and Entrepreneurial Instinct. On a small squad there is no gap between product thinking and engineering. You own the problem from user need to production system. You can sit with the people whose work you are transforming, understand their workflow, identify the highest-value intervention, and build it without waiting for a product manager to write a spec.
You have either built something yourself (a product, a startup, a project with real users) or operated with that founder mindset inside a larger organisation. You understand that speed matters and that shipping something useful beats polishing something theoretical.
AI-Native Engineering. You build with Claude Code daily. You set context and constraints before generating code. You review AI output critically. You augment the tool with skills, system prompts, and domain context to make it effective. This is how the team works, and you help define what good looks like.
Full-Stack Delivery. You work across the stack: LLM integration, backend services, data pipelines, and enough frontend to ship end to end. The boundaries between these layers dissolve in agent systems, and so should your willingness to work across them.
Communication. You can explain technical strategy to a CPO, walk a product manager through a cost trade-off, and give direct feedback in code review. You represent the team's technical approach in cross-functional forums with product, design, learning design, compliance, and other engineering teams. You document decisions, not just code.

Benefits

Comp & perks

Time off - 27 days holiday and 8 bank holidays per year, plus 5 additional days off: 1 life event day, 2 volunteer days, and 2 company-wide wellbeing days (M-Powered Weekend)
Health & Wellness - private medical insurance with Bupa, a medical cashback scheme, life insurance, gym membership and wellness resources through Wellhub, and access to Spill for all-in-one mental health support
Hybrid work offering - for most roles, we collaborate in the office three days per week; the exception is Coaches and Instructors, who collaborate in the office once a month
Work-from-anywhere scheme - you'll have the opportunity to work from anywhere, up to 10 days per year
Space to connect - beyond the desk, we make time for weekly catch-ups and seasonal celebrations, and our kitchen is always stocked