Tech Stack
Airflow, AWS, Azure, Cloud, ETL, Google Cloud Platform, Kafka, Python, Spark, SQL
About the role
- Design, build and scale the BI data analytics platform from the ground up.
- Build the foundations for scalable data pipelines and robust data modeling that serve real-time and batch analytics, ML models and business insights.
- Take full ownership of the design and implementation of a scalable, efficient BI data infrastructure, ensuring high performance, reliability and security.
- Lead the design and architecture of the data platform from integration to transformation, modeling, storage and access.
- Build and maintain ETL/ELT pipelines, batch and real-time, to support analytics, reporting and product integrations.
- Establish and enforce best practices for data quality, lineage, observability and governance to ensure accuracy and consistency.
- Integrate modern tools and frameworks such as Airflow, dbt, Databricks, Power BI and streaming platforms (a minimal Airflow sketch follows this list).
- Collaborate cross-functionally with product, engineering and analytics teams to translate business needs into data infrastructure.
- Promote a data-driven culture and empower stakeholders with reliable self-service data access.
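To make the pipeline responsibilities above concrete, here is a minimal sketch of the kind of daily batch ELT job this role would own, written with Airflow's TaskFlow API (Airflow 2.4+ assumed). The DAG name, the stubbed extract/transform/load steps and the sample data are hypothetical illustrations, not part of the actual stack.

```python
# Minimal daily batch ELT DAG sketch (Airflow 2.4+ assumed).
# Task bodies are stubs; names and sample data are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["bi", "elt"],
)
def daily_orders_elt():
    @task()
    def extract() -> list[dict]:
        # In practice this would pull from an operational source
        # (API, OLTP replica, object storage); stubbed here.
        return [{"order_id": 1, "amount_usd": 42.0}]

    @task()
    def transform(rows: list[dict]) -> list[dict]:
        # Light cleaning / typing before loading to the warehouse.
        return [r for r in rows if r["amount_usd"] > 0]

    @task()
    def load(rows: list[dict]) -> None:
        # Placeholder for a warehouse load (e.g. COPY into a staging table).
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))


daily_orders_elt()
```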
Requirements
- 5+ years of hands-on experience in data engineering and in building data products for analytics and business intelligence.
- Proven track record of designing and implementing large-scale data platforms or ETL architectures from the ground up.
- Strong hands-on experience with ETL tools and data warehouse/lakehouse products (Airflow, Airbyte, dbt, Databricks).
- Experience supporting both batch pipelines and real-time streaming architectures (e.g., Kafka, Spark Streaming); a minimal streaming sketch follows this list.
- Proficiency in Python, SQL, and cloud data engineering environments (AWS, Azure, or GCP).
- Familiarity with data visualization tools like Power BI, Looker, or similar.
- BSc in Computer Science or a related field from a leading university.
- Nice to have: experience working on early-stage projects and building data systems from scratch.
- Nice to have: background in building operational analytics pipelines where analytical data feeds real-time product logic.
- Nice to have: hands-on experience with ML model training pipelines.
- Nice to have: experience in cost optimization in modern cloud environments.
- Nice to have: knowledge of data governance principles, compliance, and security best practices.
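For the streaming experience referenced above, a minimal sketch of a Spark Structured Streaming job consuming a Kafka topic (PySpark assumed; the broker address, topic name and event schema are hypothetical, and the spark-sql-kafka connector package matching your Spark version is required):

```python
# Minimal Structured Streaming sketch: read JSON events from Kafka,
# parse them against a hypothetical schema, and print to the console.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount_usd", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # hypothetical broker
    .option("subscribe", "orders")                         # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Console sink for illustration; a real pipeline would write to a
# lakehouse table or serving store instead.
query = events.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```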