
Lead AI Engineer
EXL
full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Job Level
Senior
Tech Stack
Flask, Google Cloud Platform, Python, PyTorch, scikit-learn, TensorFlow
About the role
- Design and implement Retrieval-Augmented Generation pipelines to ground LLMs in enterprise or domain-specific data.
- Make strategic decisions on chunking strategy, embedding models, and retrieval mechanisms to balance context precision, recall, and latency.
- Work with vector databases (Qdrant, Weaviate, pgvector, Pinecone) and embedding frameworks (OpenAI, Hugging Face, Instructor, etc.).
- Diagnose and iterate on challenges such as chunk-size trade-offs, retrieval quality, context window limits, and grounding accuracy, using structured evaluation and metrics.
- Establish comprehensive evaluation frameworks for LLM applications, combining quantitative metrics (BLEU, ROUGE, response time) with qualitative methods (human evaluation, LLM-as-a-judge, relevance, coherence, user satisfaction).
- Implement continuous monitoring and automated regression testing using tools like LangSmith, LangFuse, Arize, or custom evaluation harnesses.
- Identify and prevent quality degradation, hallucinations, or factual inconsistencies before production release.
- Collaborate with design and product to define success metrics and user feedback loops for ongoing improvement.
- Implement multi-layered guardrails across input validation, output filtering, prompt engineering, re-ranking, and abstention (“I don’t know”) strategies.
- Use frameworks such as Guardrails AI, NeMo Guardrails, or Llama Guard to ensure compliance, safety, and brand integrity.
- Design and operate multi-agent workflows using orchestration frameworks such as LangGraph, AutoGen, CrewAI, or Haystack.
- Coordinate routing logic, task delegation, and parallel vs. sequential agent execution to handle complex reasoning or multi-step tasks.
Requirements
- 10+ years of experience in Data Science, Data Engineering, or Machine Learning.
- Bachelor’s Degree in Computer Science, Information Systems, or a related field.
- Proficiency in Python (FastAPI, Flask, asyncio); GCP experience is a plus.
- Demonstrated hands-on RAG implementation experience with specific tools, models, and evaluation metrics.
- Practical knowledge of agentic frameworks (LangGraph, LangChain) and evaluation ecosystems (LangFuse, LangSmith).
- Excellent communication skills, proven ability to collaborate cross-functionally, and a low-ego, ownership-driven work style.
- Experience in traditional AI/ML workflows, such as model training, feature engineering, and deployment of ML models (scikit-learn, TensorFlow, PyTorch).
- Familiarity with retrieval optimization, prompt tuning, and tool-use evaluation.
- Background in observability and performance profiling for large-scale AI systems.
- Understanding of security and privacy principles for AI systems (PII redaction, authentication/authorization, RBAC).
- Exposure to enterprise chatbot systems, LLMOps pipelines, and continuous model evaluation in production.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, FastAPI, Flask, asyncio, Data Science, Data Engineering, Machine Learning, RAG implementation, model training, feature engineering
Soft skills
communication skills, collaboration, ownership-driven work style, cross-functional teamwork
Certifications
Bachelor’s Degree in Computer Science, Bachelor’s Degree in Information Systems