
Machine Learning Engineer – Document Intelligence, Applied GenAI
PandaDoc
Full-time
Location Type: Remote
Location: Germany
About the role
- Build and maintain evaluation frameworks for document models, LLMs, OCR, and structured extraction.
- Define metrics, benchmarks, and validation strategies for real-world document workloads.
- Design and curate high-quality datasets for supervised training, fine-tuning, and validation.
- Create scalable preprocessing pipelines for PDFs, scans, images, forms, and semi-structured documents.
- Train and fine-tune transformer-based OCR, VLMs, layout models, and open-source LLMs for document understanding tasks.
- Optimize models for reliability, accuracy, and cost efficiency in production environments.
- Deploy ML models with modern inference runtimes (vLLM, TGI, TensorRT, ONNX Runtime).
- Build guardrails, monitoring, and fallback mechanisms to ensure safe and predictable model behavior.
- Develop retrieval and chunking strategies tailored to document structures (tables, forms, multi-page PDFs).
- Optimize end-to-end RAG pipelines for semantic search, Q&A, and workflow automation.
- Partner with PMs, backend engineers, and product designers to define AI opportunities and translate requirements into technical solutions.
Requirements
- 5+ years of Python experience
- Experience training, fine-tuning, and deploying traditional computer vision models for document intelligence tasks (layout detection, table extraction, OCR, information extraction)
- Hands-on experience with document understanding frameworks and models:
  - Traditional document AI models (LayoutLM, Donut, DocFormer)
  - Modern vision-language models with OCR capabilities (DeepSeek-OCR, LightOnOCR-1B, etc.)
- Experience deploying and optimizing models using inference frameworks such as vLLM (preferred), TGI, TensorRT, or ONNX Runtime
- Experience applying LLMs to document intelligence workflows, including both frontier models and open-source alternatives
- Strong understanding of coordinate systems and spatial reasoning for absolute positioning and field detection in forms and documents
Benefits
- An honest, open culture that emphasizes feedback and promotes professional and personal development
- An opportunity to work from anywhere — our team is distributed worldwide, from Lisbon to Manila, from Florida to California
- 6 self-care days
- A competitive salary
- And much more!
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, document models, LLMs, OCR, structured extraction, transformer-based models, layout detection, table extraction, information extraction, spatial reasoning