PandaDoc

Machine Learning Engineer – Document Intelligence, Applied GenAI

full-time

Location Type: Remote

Location: Germany

About the role

  • Build and maintain evaluation frameworks for document models, LLMs, OCR, and structured extraction.
  • Define metrics, benchmarks, and validation strategies for real-world document workloads.
  • Design and curate high-quality datasets for supervised training, fine-tuning, and validation.
  • Create scalable preprocessing pipelines for PDFs, scans, images, forms, and semi-structured documents.
  • Train and fine-tune transformer-based OCR, VLMs, layout models, and open-source LLMs for document understanding tasks.
  • Optimize models for reliability, accuracy, and cost efficiency in production environments.
  • Deploy ML models with modern inference runtimes (vLLM, TGI, TensorRT, ONNX Runtime).
  • Build guardrails, monitoring, and fallback mechanisms to ensure safe and predictable model behavior.
  • Develop retrieval and chunking strategies tailored to document structures (tables, forms, multi-page PDFs).
  • Optimize end-to-end RAG pipelines for semantic search, Q&A, and workflow automation.
  • Partner with PMs, backend engineers, and product designers to define AI opportunities and translate requirements into technical solutions.

Requirements

  • 5+ years of Python experience
  • Experience training, fine-tuning, and deploying traditional computer vision models for document intelligence tasks (layout detection, table extraction, OCR, information extraction)
  • Hands-on experience with document understanding frameworks and models:
      • Traditional document AI models (LayoutLM, Donut, DocFormer)
      • Modern vision-language models with OCR capabilities (DeepSeek-OCR, LightOnOCR-1B, etc.)
  • Experience deploying and optimizing models using inference frameworks such as vLLM (preferred), TGI, TensorRT, or ONNX Runtime
  • Experience applying LLMs to document intelligence workflows, including both frontier models and open-source alternatives
  • Strong understanding of coordinate systems and spatial reasoning for absolute positioning and field detection in forms/documents

Benefits

  • An honest, open culture that emphasizes feedback and promotes professional and personal development
  • An opportunity to work from anywhere — our team is distributed worldwide, from Lisbon to Manila, from Florida to California
  • 6 self-care days
  • A competitive salary
  • And much more!