Salary
💰 $105,400–$140,000 per year
Tech Stack
AWS, Azure, Cloud, ETL, Google Cloud Platform, Informatica, Postgres, Python, SQL
About the role
- Design, develop, and maintain robust data pipelines and architectures using modern ETL/ELT frameworks.
- Integrate and standardize healthcare data from diverse sources including EHR, claims, lab systems, and patient portals.
- Collaborate with Data Architects to implement scalable models that support BI, analytics, and data science initiatives.
- Build automated ETL workflows that ensure high performance, reliability, and data integrity; monitor data jobs and troubleshoot issues.
- Document pipelines, technical processes, and transformation steps; ensure proper logging and visibility into pipeline performance.
- Shape data from raw intake into usable staging for BI programs and analytics platforms; work with Backend Engineers on API data requirements.
- Work across multiple existing database structures to understand table schemas and build queries for data extraction and forensic discovery.
- Contribute to enterprise data strategy, translate analytics needs into engineering deliverables, and support client data queries and record requests.
- Ensure data quality, accuracy, and security; partner with governance teams to implement metadata management, data lineage, and stewardship practices; maintain compliance with HIPAA, HITECH, and internal policies.
Requirements
- Bachelor’s degree in Computer Science, Information Systems, Health Informatics, or related field preferred.
- 5+ years of experience in data engineering, preferably in a healthcare or regulated industry, required.
- Experience with cloud platforms such as Azure, AWS, or GCP.
- Knowledge of data governance frameworks and tools (e.g., Collibra, Informatica).
- Exposure to DevOps, CI/CD pipelines, and Agile development practices.
- Familiarity with healthcare data formats (HL7, FHIR, CCD), structures, and compliance requirements.
- Expertise in ETL development and tools (e.g., Python, Azure Data Factory).
- Experience with data transformation models (e.g., dbt, T-SQL).
- Proficiency in SQL, Python, YAML, JSON.
- Comfortable with a variety of data formats and the pros and cons of each (e.g., CSV, Feather, Parquet).
- Proficiency in multiple database systems (e.g., MS SQL, Postgres, Snowflake).
- Must be fully vaccinated against COVID-19 (requests for accommodation considered).