Tech Stack
Airflow, Amazon Redshift, BigQuery, ETL, Java, Kafka, Python, Scala, Spark, SQL, Terraform
About the role
- Architect and lead development of large-scale systems for data pipelines and data lakes handling billions of daily events
- Design and implement solutions for data availability, security, and scalability across the platform, enabling both real-time and batch processing
- Make high-level architectural decisions about system design, technology choices, and platform evolution
- Collaborate with product teams, data engineers, back-end developers, and ML engineers to build tools and frameworks for analytics, product features, and workflows
- Work with business stakeholders to build high-impact data products for features, research, and experimentation
- Collaborate with leadership to define the team charter and technical roadmaps
- Mentor engineers across back-end infrastructure and data engineering disciplines
Requirements
- Experience in large-scale distributed computing systems, data engineering, and data modeling
- Experience managing live production environments, high-load systems, or data-intensive workflows
- Expert in SQL
- Proficient in Spark, Kafka, Terraform, and at least one programming language (Python, Scala, or Java)
- Strong knowledge of ETL/ELT design patterns, orchestration tools (e.g., Airflow, dbt, Dagster), and data quality frameworks
- Ability to design scalable, secure, and maintainable data models and architectures, with an understanding of data governance and GDPR/CCPA compliance
- Hands-on experience with modern data storage technologies (e.g., Delta Lake, Snowflake, BigQuery, Redshift)
- Strategic thinker able to evaluate building in-house versus adopting third-party solutions
- Excellent communicator and collaborator
- Ability to work independently with minimal guidance
- Embodies EAGER values and MOVE principles