Proximity Works

Senior Data Scientist – LLMs, RAG, Multimodal AI

full-time

Origin: 🇮🇳 India

Job Level

Senior

Tech Stack

Python, PyTorch, TensorFlow

About the role

  • Design, fine-tune, and optimize LLMs for applied multimodal generation use cases.
  • Build and productionize RAG pipelines that combine embedding-based search, metadata filtering, and LLM-driven re-ranking/summarization.
  • Apply prompt engineering, RAG techniques, and model distillation to improve grounding, reduce hallucinations, and ensure output reliability.
  • Define and implement evaluation metrics across semantic search (nDCG, Recall@K, MRR) and generation quality (grounding accuracy, hallucination rate).
  • Optimize inference pipelines for latency-sensitive use cases, using strategies like token budgeting and prompt compression to meet sub-100ms response targets.
  • Train and adapt models via transfer learning, LoRA/QLoRA, and checkpoint reloading, ensuring robust deployment in production environments.
  • Collaborate with product and research teams to explore innovative multimodal integrations for user-facing applications.
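
For reference, the semantic search metrics named above (Recall@K, MRR, nDCG) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's evaluation code: the function names, the document IDs, and the use of linear gains in nDCG are all hypothetical choices made for the example.

```python
import math
from typing import Dict, List, Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Sequence[str]], relevants: List[Set[str]]) -> float:
    """MRR: the reciprocal rank averaged over all queries."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in zip(runs, relevants)) / len(runs)


def ndcg_at_k(ranked: Sequence[str], gains: Dict[str, float], k: int) -> float:
    """nDCG@k with linear gains (assumption) and a log2 position discount."""
    dcg = sum(
        gains.get(doc, 0.0) / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: the system returned four documents, two are relevant.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
gains = {"d1": 1.0, "d2": 1.0}

print(recall_at_k(ranked, relevant, k=3))
print(mean_reciprocal_rank([ranked], [relevant]))
print(ndcg_at_k(ranked, gains, k=4))
```

Recall@K and MRR only need binary relevance labels, while nDCG also accepts graded gains, which is why it takes a dictionary of per-document gains here.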

Requirements

  • Strong background in NLP, machine learning, and multimodal AI.
  • Proven hands-on experience in LLM fine-tuning, RAG, distillation, and evaluation of foundation models.
  • Expertise in semantic search and retrieval pipelines (e.g., FAISS, Weaviate, Vespa, Pinecone).
  • Demonstrated ability to deploy models at scale, including distributed inference setups.
  • Solid understanding of evaluation frameworks for ranking, retrieval, and generation.
  • Proficiency in Python, PyTorch/TensorFlow, and modern ML toolkits.
  • Experience in multimodal AI (bridging text, vision, or speech with LLMs).
  • Track record of shipping latency-sensitive AI products.
  • Strong communication skills and the ability to collaborate with cross-functional global teams.