Responsible for designing, constructing, installing, and maintaining large-scale processing systems and other infrastructure.
Ensure that data, whether structured or unstructured, is readily accessible to and usable by analysts.
Build ETL tools, migrate legacy systems to modern data ecosystems, and handle FHIR resources in healthcare data environments (a FHIR sketch follows this list).
Design data pipelines, optimize data processing, and deliver actionable insights.
Manage GCP services like BigQuery, Dataflow, Pub/Sub, and Cloud Storage to deliver business-critical insights.
Possess strong cloud-based data engineering skills and hands-on experience with GCP services.
Optimize existing workflows for performance, scalability and cost-efficiency.
Design and implement data pipelines using tools like Apache Beam, Dataflow, or Cloud Composer (Airflow); a batch pipeline sketch follows this list.
Develop, optimize, and manage large-scale ETL/ELT workflows and processes on GCP.
Utilize BigQuery for data warehousing and analytics, writing complex SQL queries for reporting and analysis (a BigQuery query sketch follows this list).
Build and maintain real-time data streaming solutions using Pub/Sub and Dataflow (a streaming sketch follows this list).
Implement best practices for data security, governance and compliance (IAM roles, encryption).
Manage and maintain GCP storage systems like Cloud Storage, ensuring high availability and scalability (a lifecycle-policy sketch follows this list).
Monitor and troubleshoot data pipelines and workflows, ensuring reliability and performance.
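To make the FHIR duty above concrete, here is a minimal sketch of reading a few fields from a FHIR R4 Patient resource stored as JSON. The file name, path, and field selection are illustrative assumptions, not details from this posting.

    # Minimal sketch: extract basic fields from a FHIR R4 Patient resource.
    # The file path and printed fields are illustrative assumptions.
    import json

    with open("patient_example.json") as fh:
        patient = json.load(fh)

    # Every FHIR resource carries a resourceType discriminator.
    assert patient["resourceType"] == "Patient"

    # HumanName entries may carry a "use" code such as "official".
    official_name = next(
        (n for n in patient.get("name", []) if n.get("use") == "official"),
        None,
    )

    print("Patient id:", patient.get("id"))
    print("Birth date:", patient.get("birthDate"))
    if official_name:
        print("Family name:", official_name.get("family"))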
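As a sketch of the pipeline-design duty, the following outlines a minimal Apache Beam batch pipeline (Python SDK) that could be submitted to Dataflow via --runner=DataflowRunner. The bucket paths and the three-field CSV layout are placeholder assumptions.

    # Minimal Apache Beam batch pipeline sketch (Python SDK).
    # Bucket paths and the CSV layout are illustrative placeholders.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        # Runner, project, and region are normally passed on the command line,
        # e.g. --runner=DataflowRunner --project=... --region=...
        options = PipelineOptions()

        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                | "ReadRawFiles" >> beam.io.ReadFromText("gs://example-bucket/raw/*.csv")
                | "ParseRows" >> beam.Map(lambda line: line.split(","))
                | "KeepCompleteRows" >> beam.Filter(lambda fields: len(fields) == 3)
                | "FormatOutput" >> beam.Map(lambda fields: ",".join(fields))
                | "WriteResults" >> beam.io.WriteToText("gs://example-bucket/processed/part")
            )

    if __name__ == "__main__":
        run()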
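For the BigQuery reporting duty, a small sketch using the google-cloud-bigquery client; the project, dataset, table, and column names are assumptions chosen only for illustration.

    # Sketch: run an analytical query with the google-cloud-bigquery client.
    # Project, dataset, table, and column names are illustrative placeholders.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses Application Default Credentials

    query = """
        SELECT department, COUNT(*) AS encounter_count
        FROM `example-project.analytics.encounters`
        WHERE encounter_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
        GROUP BY department
        ORDER BY encounter_count DESC
    """

    for row in client.query(query).result():
        print(f"{row.department}: {row.encounter_count}")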
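For the streaming duty, a sketch of an unbounded Beam pipeline that reads from Pub/Sub and appends rows to an existing BigQuery table; the subscription and table names are assumptions.

    # Streaming sketch: Pub/Sub -> Beam -> BigQuery (Python SDK).
    # Subscription and table names are illustrative; the target table is
    # assumed to already exist, hence CREATE_NEVER.
    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

    def run():
        options = PipelineOptions()
        options.view_as(StandardOptions).streaming = True  # unbounded source

        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                    subscription="projects/example-project/subscriptions/events-sub")
                | "DecodeJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    "example-project:analytics.events",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER)
            )

    if __name__ == "__main__":
        run()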
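For the Cloud Storage duty, a sketch of applying lifecycle rules with the google-cloud-storage client; the bucket name and the 30/365-day retention thresholds are assumptions.

    # Sketch: apply lifecycle rules to a Cloud Storage bucket.
    # Bucket name and age thresholds are illustrative placeholders.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.get_bucket("example-raw-data-bucket")

    # Move objects to Nearline after 30 days, delete them after 365 days.
    bucket.add_lifecycle_set_storage_class_rule("NEARLINE", age=30)
    bucket.add_lifecycle_delete_rule(age=365)
    bucket.patch()  # persist the updated lifecycle configuration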
This position will require 5% domestic travel to client sites. In addition, relocation is not required for this position. Position may work at various and unanticipated worksites throughout the United States. Telecommuting permitted.
Requirements
Requires Bachelor’s degree, or foreign equivalent, in Computer Science, Engineering, or a directly related field, plus eight (8) years of experience in data engineering, building and maintaining large-scale data solutions.
Must have five (5) years of experience in the following: Google Cloud Platform services such as Dataflow, Dataproc, Pub/Sub, Cloud Storage, BigQuery, and Cloud Composer; Managing cloud-based data warehouses and optimizing cost and performance; Apache Beam and its integration with GCP Dataflow for batch and stream processing; Handling Cloud Storage buckets for managing raw and processed data, including lifecycle policies and data retention; Building, testing and optimizing ETL pipelines for large-scale data processing.
Must have three (3) years of experience in the following: Handling big data environments and the Hadoop ecosystem.
Must have two (2) years of experience in the following: Working with FHIR standards and healthcare data interoperability; HL7 standards, FHIR data modeling and healthcare data exchange protocols.
Employer will also accept a Master’s degree plus five (5) years of experience in data engineering, building and maintaining large-scale data solutions, in lieu of a Bachelor’s degree plus eight (8) years of such experience.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data engineering, ETL, data pipelines, SQL, big data, Apache Beam, data warehousing, data processing, data modeling, healthcare data interoperability