Tech Stack
Airflow, Amazon Redshift, Azure, BigQuery, Cloud, ETL, Kafka, Python, Spark, SQL, Terraform, Vault
About the role
- Design, build, and maintain ETL/ELT pipelines and data workflows (e.g., Azure Data Factory, Databricks, Spark, ClickHouse, Airflow); a minimal orchestration sketch follows this list
- Develop and optimize data models and data warehouse/lake/lakehouse schemas (partitioning, indexing, clustering, cost/performance tuning)
- Build scalable batch and streaming processing jobs (Spark/Databricks, Delta Lake; Kafka/Event Hubs a plus)
- Ensure data quality, reliability, and observability (tests, monitoring, alerting, SLAs)
- Implement CI/CD and version control for data assets and pipelines
- Secure data and environments (IAM/Entra ID, Key Vault, strong tenancy guarantees, encryption, least privilege)
- Collaborate with application, analytics, and platform teams to deliver trustworthy, consumable datasets
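The pipeline and quality-gate responsibilities above are concrete enough to sketch. Below is a minimal Airflow DAG (assuming Airflow 2.4+), not this team's actual pipeline: the DAG, task, and field names are hypothetical.

```python
# A minimal sketch: a daily ETL DAG with a data-quality gate.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Pull the day's records from the source system (stubbed here).
    return [{"id": 1, "amount": 42.0}]


def transform(ti, **context):
    rows = ti.xcom_pull(task_ids="extract")
    return [{**r, "amount_cents": int(r["amount"] * 100)} for r in rows]


def quality_check(ti, **context):
    rows = ti.xcom_pull(task_ids="transform")
    # Fail the run (and trigger alerting) if the batch is empty or has nulls.
    assert rows, "empty batch"
    assert all(r["amount_cents"] is not None for r in rows), "null amounts"


with DAG(
    dag_id="orders_daily",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_check = PythonOperator(task_id="quality_check", python_callable=quality_check)

    t_extract >> t_transform >> t_check
```

Putting the quality check inside the DAG is what turns "tests, monitoring, alerting" from a dashboard concern into a gate: a bad batch fails the run instead of landing downstream.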
Requirements
- ETL or ELT experience required (ADF/Databricks/dbt/Airflow or similar)
- Big data experience required (building and operating large-scale batch and/or streaming data processing)
- Cloud experience required; Azure preferred (Synapse, Data Factory, Databricks, Azure Storage, Event Hubs, etc.)
- Strong SQL and performance tuning expertise; hands-on with at least one warehouse/lakehouse (Synapse/Snowflake/BigQuery/Redshift or similar)
- Solid data modeling fundamentals (star/snowflake schemas, normalization/denormalization, CDC); see the sketch after this list
- Experience with CI/CD, Git, and infrastructure automation basics
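Two of the modeling requirements above, CDC handling and partitioning for cost/performance, can be illustrated in a few lines of PySpark. This is a sketch under assumed inputs: the paths, column names, and schema are hypothetical.

```python
# A minimal sketch: compact CDC change events to the latest version per key,
# then write a date-partitioned table.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_model").getOrCreate()

changes = spark.read.parquet("s3://lake/raw/orders_cdc/")  # hypothetical path

# Keep only the most recent change event per business key (CDC compaction).
latest = Window.partitionBy("order_id").orderBy(F.col("changed_at").desc())
current = (
    changes
    .withColumn("rn", F.row_number().over(latest))
    .where(F.col("rn") == 1)
    .drop("rn")
)

# Partition by event date so downstream queries prune files cheaply --
# the partitioning/cost-tuning concern called out above.
(
    current
    .withColumn("event_date", F.to_date("changed_at"))
    .write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://lake/curated/orders/")  # hypothetical path
)
```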
Nice to have
- Streaming pipelines (Kafka, Event Hubs, Kinesis, Pub/Sub) and exactly-once/at-least-once delivery patterns; see the sketch after this list
- Orchestration and workflow tools (Airflow, Prefect, Azure Data Factory)
- Python for data engineering
- Data governance, lineage, and security best practices
- Infrastructure as Code (Terraform) for data platform provisioning
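For the delivery-pattern item above, here is a minimal at-least-once consumer loop using kafka-python: offsets are committed only after the message is processed, so a crash replays rather than drops in-flight events. The topic, group, and sink are hypothetical.

```python
# A minimal sketch of the at-least-once pattern with manual offset commits.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders-events",                     # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="orders-loader",
    enable_auto_commit=False,            # commit manually, after processing
    value_deserializer=lambda b: b.decode("utf-8"),
)


def load_to_sink(payload: str) -> None:
    # Stand-in for an idempotent write (e.g., an upsert keyed on event id).
    print(payload)


for message in consumer:
    load_to_sink(message.value)
    consumer.commit()  # offset advances only once the write has succeeded
```

At-least-once delivery implies possible duplicates, which is why the sink write should be idempotent; exactly-once semantics push that deduplication into the sink or a transactional protocol.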
Benefits
- Health insurance
- 401(k) matching
- Flexible work hours
- Paid time off
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
ETL, ELT, data modeling, SQL, performance tuning, Python, infrastructure as code, data governance, streaming pipelines, data quality
Soft skills
collaboration, communication