Geisinger

Senior Platform Data Engineer

The Senior Platform Data Engineer manages clinical data pipelines for AI readiness at Geisinger, owns shared data products, and leads data ingestion and transformation efforts.

Posted 4/16/2026 • Full-time • Remote • Pennsylvania • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies
Kafka, Pandas, PySpark, Python, Spark, SQL, Unity Catalog

About the role

Key responsibilities & impact
  • The Senior Platform Data Engineer owns roadmap, priorities, platform standards, and architecture reviews; provides formal input on performance reviews.
  • Makes clinical data ready for AI at scale by owning the shared data products, retrieval infrastructure, and platform administration that the entire AI portfolio depends on.
  • Owns real-time data feeds, reusable clinical data models and feature pipelines, and RAG retrieval infrastructure (ingestion, chunking, embeddings, vector DB, retrieval pipelines).
  • Streams data from Epic SDE, ADT feeds, lab results, and other clinical sources into Databricks for downstream model consumption.
  • Curates shared clinical feature tables (patient demographics, labs, vitals, diagnoses, utilization history, imaging metadata) in Databricks/Unity Catalog that multiple AI programs consume for model training, validation, and monitoring.
  • Designs and operates document ingestion pipelines: normalizing clinical documents, policies, guidelines, and unstructured data sources into formats ready for embedding and retrieval.
  • Implements and optimizes chunking strategies tailored to healthcare content (e.g., preserving clinical note structure, section-aware chunking for guidelines and protocols).
  • Establishes data quality gates for RAG: automated profiling, completeness checks, and accuracy scoring before content enters the vector store.
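
To make the section-aware chunking and pre-ingestion quality checks above more concrete, here is a minimal Python sketch. It is illustrative only: the heading list, the `Chunk` type, the function names `chunk_by_section` and `passes_quality_gate`, and the size thresholds are all hypothetical and are not taken from Geisinger's platform.

```python
from dataclasses import dataclass

# Hypothetical section headings; real clinical note templates vary.
SECTION_HEADINGS = ("HISTORY OF PRESENT ILLNESS", "MEDICATIONS", "ASSESSMENT", "PLAN")


@dataclass
class Chunk:
    section: str
    text: str


def chunk_by_section(note: str, max_chars: int = 500) -> list[Chunk]:
    """Split a note at known headings, then cap chunk size on line boundaries."""
    chunks: list[Chunk] = []
    section = "PREAMBLE"  # text that appears before the first known heading
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered lines as one chunk tagged with the current section.
        text = "\n".join(buffer).strip()
        if text:
            chunks.append(Chunk(section, text))
        buffer.clear()

    for line in note.splitlines():
        stripped = line.strip()
        heading = stripped.rstrip(":").upper()
        if stripped.endswith(":") and heading in SECTION_HEADINGS:
            flush()            # never let a chunk span two sections
            section = heading
            continue
        buffered = sum(len(item) + 1 for item in buffer)
        if buffer and buffered + len(line) > max_chars:
            flush()            # size cap reached; start a new chunk in the same section
        buffer.append(line)
    flush()
    return chunks


def passes_quality_gate(chunk: Chunk, min_chars: int = 20) -> bool:
    """Simple completeness check: enough text, and at least one letter."""
    return len(chunk.text) >= min_chars and any(c.isalpha() for c in chunk.text)


note = """HISTORY OF PRESENT ILLNESS:
Patient reports three days of cough and low-grade fever.
MEDICATIONS:
None.
PLAN:
Chest X-ray today; follow up in one week if symptoms persist.
"""

chunks = chunk_by_section(note)
accepted = [c for c in chunks if passes_quality_gate(c)]
```

In this sketch each chunk keeps its section label, so a downstream embedding step could store it as metadata, and the gate filters out near-empty chunks (such as the "None." medication entry) before they would reach a vector store.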

Requirements

What you’ll need
  • 5+ years in data engineering, with strong experience building both batch and streaming data pipelines
  • Expert-level Databricks skills: Delta Live Tables, PySpark, Unity Catalog, Feature Store
  • Hands-on experience with real-time data ingestion (Kafka, Spark Structured Streaming, or comparable frameworks)
  • Strong SQL and Python (pandas, PySpark) skills for data transformation and feature engineering
  • Experience administering Databricks workspaces: cluster policies, compute management, access controls, cost monitoring
  • Familiarity with clinical data models and healthcare data sources (EHR extracts, ADT feeds, lab results, claims data) strongly preferred
  • Experience with Epic data extraction methods (SDE, FHIR, epic-ws) a significant plus
  • Understanding of data governance principles: lineage, quality monitoring, access controls

Benefits

Comp & perks
  • We offer healthcare benefits for full-time and part-time positions from day one, including vision, dental, and domestic partner coverage.
  • We encourage an atmosphere of collaboration, cooperation and collegiality.
  • We know that a diverse workforce with unique experiences and backgrounds makes our team stronger.
