Salary
💰 $153,000 - $191,000 per year
Tech Stack
Airflow, AWS, Cloud, ETL, Google Cloud Platform, Java, Kubernetes, Python, Scala, Spark, SQL, Terraform
About the role
- Design, build, and maintain scalable ETL/ELT pipelines for structured and unstructured data (a minimal orchestration sketch follows this list).
- Contribute to and maintain the enterprise data model – the source of truth in our Snowflake warehouse.
- Write and optimize complex SQL queries (including window functions, temp tables, and query performance tuning) to support analytics and reporting needs.
- Take part in designing and maintaining the centralized model layer.
- Support data warehousing solutions via Snowflake + dbt.
- Develop automation scripts in Bash, Python, or other programming languages.
- Manage cloud environments (AWS, OCI) in collaboration with infrastructure teams.
- Maintain and optimize our Kubernetes (EKS) cluster for scalable workloads.
- Implement and maintain infrastructure-as-code using tools such as Terraform, YAML manifests, and Argo for reproducible, reliable deployments.
- Debug and troubleshoot data pipelines and data quality issues across systems.
- Collaborate with stakeholders of varying technical backgrounds to translate business requirements into scalable technical solutions.
- Be an active contributor to our ETL/ELT framework, proposing improvements and optimizations.
- Contribute to best practices for data modeling, governance, and quality control.
- Explore and recommend AI tools and modern data solutions for efficiency and automation.
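
For illustration only, a pipeline of the kind described above might be orchestrated with an Airflow DAG along the lines of the sketch below; the DAG, task, and table names are hypothetical and assume a recent Airflow deployment, not details of this role.

```python
# Minimal illustrative sketch only: dag_id, task names, and placeholder logic are
# hypothetical. Assumes Apache Airflow 2.4+ (for the `schedule` argument).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders():
    """Pull a raw batch from the source system (placeholder logic)."""
    ...


def load_orders():
    """Load the extracted batch into the warehouse (placeholder logic)."""
    ...


with DAG(
    dag_id="orders_elt",            # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)
    extract >> load                 # run extract before load
```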
Requirements
- Strong understanding of data engineering concepts and data warehousing fundamentals.
- Advanced SQL skills, including debugging and performance tuning (an illustrative query appears after this list).
- Proficiency in at least one general-purpose programming language (e.g., Python, Java, Scala); the team uses Python.
- Familiarity with Kimball (Dimensional) Modeling.
- Basic scripting knowledge (Bash) for automation and operational workflows.
- Familiarity with cloud platforms (AWS, GCP, or OCI).
- Solid communication and collaboration skills to work effectively with technical and non-technical stakeholders.
- Familiarity with Git.
- Preferred: Experience with distributed computing frameworks such as Dask or Spark.
- Preferred: Hands-on experience managing and deploying workloads in Kubernetes.
- Preferred: Exposure to infrastructure-as-code (Terraform, Helm, Argo, etc.).
- Preferred: Experience with workflow orchestration systems (Airflow, Dagster, Argo Workflows, etc.).
- Preferred: Experience implementing Change Data Capture (CDC) pipelines.
- Preferred: Strong debugging and problem-solving skills for troubleshooting complex data issues.
- Preferred: Knowledge of AI tools and when to apply them in a data engineering context.
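
As a rough illustration of the advanced SQL mentioned above, the sketch below deduplicates a hypothetical raw.orders table with a window function, executed through the Snowflake Python connector; the account, credentials, and table names are placeholders, not details of this role.

```python
# Illustrative only: connection parameters and the raw.orders table are
# placeholders. Requires the snowflake-connector-python package.
import snowflake.connector

# Keep only the most recent version of each order using ROW_NUMBER().
DEDUP_LATEST_ORDERS = """
    SELECT *
    FROM (
        SELECT
            o.*,
            ROW_NUMBER() OVER (
                PARTITION BY order_id
                ORDER BY updated_at DESC
            ) AS rn
        FROM raw.orders AS o
    ) AS ranked
    WHERE rn = 1
"""

conn = snowflake.connector.connect(
    account="<account_identifier>",   # placeholder
    user="<user>",                    # placeholder
    password="<password>",            # placeholder
    warehouse="<warehouse>",          # placeholder
    database="<database>",            # placeholder
)
try:
    cur = conn.cursor()
    cur.execute(DEDUP_LATEST_ORDERS)
    for row in cur.fetchmany(10):     # peek at a few deduplicated rows
        print(row)
finally:
    conn.close()
```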