Senior Data Warehouse Engineer

GeneDx

full-time

Location: 🇺🇸 United States

Salary

💰 $153,000 - $191,000 per year

Job Level

Senior

Tech Stack

Airflow, AWS, Cloud, ETL, Google Cloud Platform, Java, Kubernetes, Python, Scala, Spark, SQL, Terraform

About the role

  • Design, build, and maintain scalable ETL/ELT pipelines for structured and unstructured data.
  • Contribute to and maintain the enterprise data model – the source of truth in our Snowflake warehouse.
  • Write and optimize complex SQL queries (including window functions, temp tables, and query performance tuning) to support analytics and reporting needs (see the illustrative sketch after this list).
  • Take part in designing and maintaining a centralized model layer.
  • Support data warehousing solutions built on Snowflake and dbt.
  • Develop automation scripts in Bash, Python, or other programming languages.
  • Manage cloud environments (AWS, OCI) in collaboration with infrastructure teams.
  • Maintain and optimize the Kubernetes (EKS) cluster for scalable workloads.
  • Implement and maintain infrastructure-as-code using tools such as Terraform, YAML-based configuration, and Argo for reproducible, reliable deployments.
  • Debug and troubleshoot data pipelines and data quality issues across systems.
  • Collaborate with stakeholders of varying technical backgrounds to translate business requirements into scalable technical solutions.
  • Be an active contributor to our ETL/ELT framework, proposing improvements and optimizations.
  • Contribute to best practices for data modeling, governance, and quality control.
  • Explore and recommend AI tools and modern data solutions for efficiency and automation.
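
To give a concrete, purely illustrative sense of the SQL and Python work described above, the sketch below runs a window-function query against Snowflake using the snowflake-connector-python package. The account, warehouse, database, table, and column names are hypothetical placeholders, not GeneDx's actual schema.

```python
# Illustrative sketch only: connection parameters, table, and column names
# are hypothetical placeholders, not taken from the job posting.
import os

import snowflake.connector

QUERY = """
    SELECT sample_id, result_status, loaded_at
    FROM (
        SELECT
            sample_id,
            result_status,
            loaded_at,
            -- ROW_NUMBER() partitions by sample and orders by load time,
            -- so rn = 1 keeps only the newest record per sample.
            ROW_NUMBER() OVER (
                PARTITION BY sample_id
                ORDER BY loaded_at DESC
            ) AS rn
        FROM analytics.lab_results
    )
    WHERE rn = 1
"""


def latest_result_per_sample() -> list[tuple]:
    """Return the most recent result row for each sample."""
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="ANALYTICS_WH",
        database="WAREHOUSE",
        schema="ANALYTICS",
    )
    try:
        cur = conn.cursor()
        try:
            cur.execute(QUERY)
            return cur.fetchall()
        finally:
            cur.close()
    finally:
        conn.close()
```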

Requirements

  • Strong understanding of data engineering concepts and data warehousing fundamentals.
  • Advanced SQL skills, including debugging and performance tuning.
  • Proficiency in at least one general-purpose programming language (e.g., Python, Java, Scala); the team primarily works in Python.
  • Familiarity with Kimball (Dimensional) Modeling.
  • Basic scripting knowledge (Bash) for automation and operational workflows.
  • Familiarity with cloud platforms (AWS, GCP, or OCI).
  • Solid communication and collaboration skills to work effectively with technical and non-technical stakeholders.
  • Familiarity with Git.
  • Preferred: Experience with distributed computing frameworks such as Dask or Spark.
  • Preferred: Hands-on experience managing and deploying workloads in Kubernetes.
  • Preferred: Exposure to infrastructure-as-code (Terraform, Helm, Argo, etc.).
  • Preferred: Experience with workflow orchestration systems (Airflow, Dagster, Argo Workflows, etc.); see the orchestration sketch after this list.
  • Preferred: Experience implementing Change Data Capture (CDC) pipelines.
  • Preferred: Strong debugging and problem-solving skills for troubleshooting complex data issues.
  • Preferred: Knowledge of AI tools and when to apply them in a data engineering context.
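
For the preferred orchestration experience, here is a minimal, hypothetical Apache Airflow sketch using the TaskFlow API (assumes Airflow 2.4+): a daily extract-and-load DAG with placeholder task bodies. The DAG id, schedule, and tasks are illustrative assumptions, not GeneDx pipelines.

```python
# Illustrative sketch only: DAG id, schedule, and task bodies are
# hypothetical placeholders. Assumes Apache Airflow 2.4+ (TaskFlow API).
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    dag_id="example_elt_refresh",
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["example"],
)
def example_elt_refresh():
    @task
    def extract() -> list[dict]:
        # Stand-in for pulling incremental records from a source system.
        return [{"sample_id": "S-001", "status": "complete"}]

    @task
    def load(rows: list[dict]) -> int:
        # Stand-in for writing rows into a warehouse staging table.
        print(f"loaded {len(rows)} rows")
        return len(rows)

    # extract() feeds load(); Airflow infers the task dependency from the call.
    load(extract())


example_elt_refresh()
```

The same shape would apply to orchestrating dbt runs or CDC loads; only the task bodies change.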