About the role
Key responsibilities & impact
- Sit in on a customer session, understand how their agents are failing, design an eval that captures it, and drive a fix through to shipped improvement.
- Close a piece of the outer loop end to end: production signal in, dataset out, eval scored, harness change shipped, metric moved.
- Own a slice of our eval infrastructure: dataset curation, harness configuration, runner, analysis, and the comms back to engineering.
- Prototype a new harness or context configuration and measure whether it actually moves the needle on real customer tasks.
- Dig through pages of agent traces, build the tooling you need to make sense of them, and brief the team on what you found.
- Partner with product and engineering on near-term shipping problems by bringing research rigour.
- Pull a recent paper apart, work out what's actually transferable to our platform, and turn it into a concrete experiment.
Requirements
What you’ll need
- 4+ years shipping AI/ML products in a startup or applied industry setting
- Demonstrated depth in at least one of the four skill areas above
- Strong product and customer instincts: comfort joining customer calls, watching session recordings, and letting real workflows shape what you work on
- Sharp evaluation judgement: benchmarks where they exist, vibes and quick prototypes where they don't, and the taste to know which is appropriate
- Experience building datasets for evaluation or training
- Deep curiosity about agents and excitement about reshaping how software is built
Nice to have
- A Master's or PhD in a relevant computational field
- Direct experience with coding agents or code-generation systems
- Background in RL, bandits, or other outer-loop optimisation frameworks applied to LLMs
- Experience building synthetic data, dataset infrastructure, or internal tooling that other engineers actually used
- A project you can show us (GitHub links welcome) and a thoughtful answer to *"Why Tessl?"*
Benefits
Comp & perks
- Health insurance extending to partners and dependents
- Pension contributions
- Regular team lunches
- Drinks and socials
