
Senior Data Scientist
LexisNexis
full-time
Location Type: Hybrid
Location: California, Virginia • 🇺🇸 United States
Salary
💰 $102,800 - $171,300 per year
Job Level
Senior
Tech Stack
Python
About the role
- Join our team to help build state-of-the-art research tools.
- Own the end‑to‑end design and continuous evolution of a multimodal document understanding and structured data extraction platform: complex PDF / scanned page layout analysis, semantic extraction, structural reconstruction, quality validation, and business integration.
- Lead multimodal model strategy (vision + language + layout) and multi‑agent collaboration (task decomposition, verification, conflict reconciliation, feedback loops), and plan future customized training and ongoing optimization of models.
- Design and iterate the multimodal document parsing pipeline: layout / structural modeling, semantic extraction, cross‑modal alignment, structural reconstruction.
- Build and optimize a multi‑agent collaboration mechanism: task splitting, parallel / sequential scheduling, peer review, iterative quality improvement loops.
- Define model selection / composition / routing strategies (dynamic dispatch by document type, structural patterns, quality signals).
- Plan and execute model fine‑tuning, domain adaptation, continual learning, active learning, and data feedback loops.
- Establish end‑to‑end metrics: extraction accuracy, structural consistency, agent collaboration effectiveness, latency, stability, and cost.
- Build quality assurance and risk controls: drift & anomaly monitoring, confidence estimation, fallback strategies, alignment / compliance checks.
- Drive the mapping of agent / model outputs to business knowledge field standards and keep the two consistent.
Requirements
- Education: Master’s degree or above in a quantitative or technical field (Statistics, Computer Science, Mathematics, Data Science, etc.)
- Experience: 5+ years of hands‑on machine learning / data science experience.
- Proven delivery experience in multimodal (vision + text) or complex document understanding.
- Practical experience orchestrating agents (or modular processing logic) in production workflows.
- Solid foundation in machine learning / deep learning fundamentals, multimodal representations, and cross‑modal alignment concepts.
- Deep understanding of core principles and common algorithms for multimodal large models: cross‑modal attention & representation alignment, vision/text embedding fusion, hierarchical & layout structure modeling, instruction & contrastive paradigms, long‑context and retrieval‑augmented mechanisms, evaluation and failure‑mode analysis.
- Familiarity with classic image and signal processing methods: edge & contour detection, filtering & denoising, morphological operations, segmentation & key point feature extraction, frequency / time‑frequency analysis, image enhancement & quality assessment; understanding of their trade‑offs and complementarity with deep features.
- Knowledge of multi‑agent collaboration patterns: role assignment, task routing, feedback loops, redundancy & cross‑checks.
- Strong in statistical analysis & experimental design: hypothesis testing, factorial design, power analysis, A/B and multivariate evaluation.
- Able to decompose complex problems and build metric‑driven optimization paths.
- Rigorous in data quality & error analysis; rapid bottleneck identification.
- Ability to translate research pseudo‑code into maintainable, testable Python modules with benchmarking & regression harnesses.
Benefits
- Health Benefits: Comprehensive, multi-carrier program for medical, dental and vision benefits
- Retirement Benefits: 401(k) with match and an Employee Share Purchase Plan
- Wellbeing: Wellness platform with incentives, Headspace app subscription, Employee Assistance and Time-off Programs
- Short- and Long-Term Disability, Life and Accidental Death Insurance, Critical Illness, and Hospital Indemnity
- Family Benefits, including bonding and family care leaves, adoption and surrogacy benefits
- Health Savings, Health Care, Dependent Care and Commuter Spending Accounts
- In addition to annual Paid Time Off, we offer up to two days of paid leave each to participate in Employee Resource Groups and to volunteer with your charity of choice
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
machine learning, data science, multimodal document understanding, deep learning, cross-modal alignment, statistical analysis, experimental design, Python, quality assurance, data feedback loops
Soft skills
leadership, problem decomposition, metric-driven optimization, collaboration, communication, task routing, feedback loops, quality improvement, risk management, bottleneck identification