Tech Stack
Airflow, AWS, Azure, Cloud, ETL, Google Cloud Platform, Kafka, Python, SQL, Tableau
About the role
We are seeking an experienced, highly motivated Senior/Staff Data Engineer to lead the design, development, and optimization of scalable data solutions in a complex, fast-paced environment. The role focuses on delivering robust, high-performance data pipelines and platforms that support advanced analytics, machine learning, and operational data needs. The ideal candidate has extensive experience with U.S. healthcare data, including deep knowledge of claims processing and HL7 FHIR standards, and a strong background in modern data engineering technologies.
Responsibilities
- Lead the design, development, and maintenance of scalable ETL/ELT pipelines for structured, semi-structured, and unstructured data.
- Architect, implement, and optimize data warehouses and lakehouses leveraging Snowflake and/or Databricks.
- Integrate and process data from diverse sources, including EHR systems, claims systems, HL7 FHIR APIs, and other healthcare-specific datasets.
- Ensure data quality, lineage, and governance in compliance with HIPAA and other healthcare regulations.
- Collaborate with business stakeholders, data scientists, and analysts to translate business requirements into efficient and reliable data solutions.
- Optimize SQL and Python code for performance, scalability, and cost efficiency.
- Implement orchestration, automation, and monitoring frameworks (e.g., Airflow, dbt, Azure Data Factory).
- Lead troubleshooting efforts for data pipeline and infrastructure issues, ensuring system reliability.
- Establish and enforce best practices for data engineering, including CI/CD pipelines, testing, and documentation.
- Mentor junior engineers and contribute to the overall technical strategy for data engineering.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Data Engineering, or a related field.
- 7+ years of professional experience in data engineering, with proven leadership in complex projects.
- Strong expertise in SQL and Python for data processing and automation.
- Proficiency in either Snowflake or Databricks, with a willingness to learn the other.
- Required: hands-on experience with U.S. healthcare data, including claims processing workflows and regulatory compliance.
- Solid understanding of HL7 FHIR standards, data integration patterns, and interoperability challenges.
- Experience with HIPAA-compliant data solutions and security best practices.
- Cloud platform proficiency (AWS, Azure, or GCP) with associated data services.