Conversational AI – Prompt Engineer

ZenBusiness

full-time

Posted on: 3/9/2026

Location Type: Remote

Location: United States

Job Level

Mid-Level, Senior

Tech Stack

Python, TypeScript

About the role

Analyze conversation transcripts and user feedback to identify areas of confusion, failure, and prompt leakage.
Work with the Customer Impact Team Product Lead to define and track conversational KPIs (e.g., resolution rate, containment rate, user satisfaction).
Optimize prompts and model selection for cost efficiency, response latency, and scalability in production environments.
Collaborate with engineers to improve conversation-specific evaluation criteria (e.g., NLU accuracy, intent recognition).
Design and maintain evaluation frameworks to measure prompt performance using golden datasets and automated scoring (e.g., LLM-as-judge, rubric-based scoring, precision/recall of intent routing).
Implement guardrails to reduce hallucinations, prevent prompt injection, and ensure compliant, safe responses.
Collaborate to design, map, and implement complex conversation flows, including error recovery and contextual handoffs (escalation to human support).
Own the continuous optimization of system prompts and instructions for LLMs (Gemini, OpenAI) to ensure Velo's responses are accurate, consistent in tone, and on-brand.
Design and optimize structured outputs, function calling, and tool-routing logic to ensure accurate data capture and downstream system integrations.

Requirements

Experience: 5+ years of experience, with 2+ years in Conversational AI, Applied LLM Engineering, Prompt Engineering, or NLP systems in production environments.
LLM Expertise: Deep experience designing and optimizing prompts for GPT, Gemini, or similar models, including structured outputs and function calling.
RAG Systems: Practical experience designing and tuning RAG pipelines (chunking, embeddings, retrieval evaluation).
Evaluation: Experience building evaluation datasets and running prompt experiments (A/B testing, automated scoring, regression testing).
Technical: Proficiency in Python or TypeScript; experience integrating LLM APIs in production systems.
Analytics: Ability to analyze conversational performance using data and logs to drive measurable improvements.
Soft Skills: Strong systems thinking, empathy for users, and ability to translate business logic into scalable AI behavior.
Agentic Systems: Experience with agentic systems similar to Decagon, Agentforce, Fin, or Sierra.

Benefits

The company offers various benefits to employees and their dependents, including medical, vision, dental, disability, and life insurance.
Parental and military leave.
Employee assistance program.
401(k) with match.
Annual bonus.
Pet insurance.
Paid parking*.
10 paid holidays.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Conversational AI, Applied LLM Engineering, Prompt Engineering, NLP systems, LLM optimization, RAG pipelines, Python, TypeScript, A/B testing, automated scoring

Soft Skills

systems thinking, empathy for users, translating business logic, collaboration, problem-solving, communication, analytical thinking, adaptability, attention to detail, creativity