Lead AI Engineer

EXL

full-time

Posted on: 12/30/2025

Location Type: Remote

Location: United States

Job Level

Senior

Tech Stack

Flask, Google Cloud Platform, Python, PyTorch, scikit-learn, TensorFlow

About the role

Design and implement Retrieval-Augmented Generation pipelines to ground LLMs in enterprise or domain-specific data.
Make strategic decisions on chunking strategy, embedding models, and retrieval mechanisms to balance context precision, recall, and latency.
Work with vector databases (Qdrant, Weaviate, pgvector, Pinecone) and embedding frameworks (OpenAI, Hugging Face, Instructor, etc.).
Diagnose and iterate on challenges like chunk size trade-offs, retrieval quality, context window limits, and grounding accuracy—using structured evaluation and metrics.
Establish comprehensive evaluation frameworks for LLM applications, combining quantitative metrics (BLEU, ROUGE, response time) with qualitative methods (human evaluation, LLM-as-a-judge) that assess relevance, coherence, and user satisfaction.
Implement continuous monitoring and automated regression testing using tools like LangSmith, LangFuse, Arize, or custom evaluation harnesses.
Identify and prevent quality degradation, hallucinations, or factual inconsistencies before production release.
Collaborate with design and product to define success metrics and user feedback loops for ongoing improvement.
Implement multi-layered guardrails across input validation, output filtering, prompt engineering, re-ranking, and abstention (“I don’t know”) strategies.
Use frameworks such as Guardrails AI, NeMo Guardrails, or Llama Guard to ensure compliance, safety, and brand integrity.
Design and operate multi-agent workflows using orchestration frameworks such as LangGraph, AutoGen, CrewAI, or Haystack.
Coordinate routing logic, task delegation, and parallel vs. sequential agent execution to handle complex reasoning or multi-step tasks.

Requirements

10+ years of experience in Data Science, Data Engineering, or Machine Learning.
Bachelor’s Degree in Computer Science, Information Systems, or a related field.
Proficiency in Python (FastAPI, Flask, asyncio); GCP experience is a plus.
Demonstrated hands-on RAG implementation experience with specific tools, models, and evaluation metrics.
Practical knowledge of agentic frameworks (LangGraph, LangChain) and evaluation ecosystems (LangFuse, LangSmith).
Excellent communication skills, proven ability to collaborate cross-functionally, and a low-ego, ownership-driven work style.
Experience in traditional AI/ML workflows — e.g., model training, feature engineering, and deployment of ML models (scikit-learn, TensorFlow, PyTorch).
Familiarity with retrieval optimization, prompt tuning, and tool-use evaluation.
Background in observability and performance profiling for large-scale AI systems.
Understanding of security and privacy principles for AI systems (PII redaction, authentication/authorization, RBAC).
Exposure to enterprise chatbot systems, LLMOps pipelines, and continuous model evaluation in production.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Python, FastAPI, Flask, asyncio, Data Science, Data Engineering, Machine Learning, RAG implementation, model training, feature engineering

Soft Skills

communication skills, collaboration, ownership-driven work style, cross-functional teamwork

Certifications

Bachelor’s Degree in Computer Science, Bachelor’s Degree in Information Systems