ARETUM

Data Engineer

Full-time

Location Type: Remote

Location: Virginia, United States

About the role

  • Ingest data from FHIR APIs, CDW, and other VA sources
  • Normalize and reconcile medication and patient data
  • Build transformation pipelines for risk scoring inputs
  • Support batch and near-real-time processing
  • Ensure data quality, consistency, and traceability

Requirements

  • Programming: Python (primary), SQL (advanced), Scala (optional)
  • Data Processing Frameworks: Apache Spark, AWS EMR, Databricks (preferred)
  • ETL/ELT Design: Pipeline orchestration, incremental vs full loads, data validation
  • API Integration: REST APIs, JSON parsing, pagination, authentication (OAuth2)
  • FHIR Data Handling: Resources such as Patient, MedicationRequest, and Observation
  • Data Modeling: Relational and semi-structured schema design
  • Data Quality & Validation: Deduplication, reconciliation logic, anomaly detection
  • Streaming vs Batch Processing: Understanding tradeoffs and implementation patterns
  • Storage Technologies: S3, relational DBs, NoSQL basics
  • Performance Optimization: Partitioning, parallelization, query tuning
  • Versioning & Lineage: Data version control, reproducibility of datasets

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off
  • Family Leave (Maternity, Paternity)
  • Short Term & Long-Term Disability
  • Training & Development

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, SQL, Scala, Apache Spark, AWS EMR, Databricks, ETL, API Integration, Data Modeling, Performance Optimization
Soft Skills
data quality, data consistency, data traceability, anomaly detection, reconciliation logic