Tech Stack
Airflow, Amazon Redshift, Apache, AWS, ETL, PySpark, Python, Spark, SQL, Terraform
About the role
- Own and drive the end-to-end data process for intelligence (business and technical) across all company ventures, including acquiring, processing (ETL), modeling with AI/ML, and reporting.
- Contribute to an automated process that leverages a data warehouse alongside ML/AI on AWS to generate insights.
- Serve as the competitive positioning expert in the blockchain landscape, providing data insights to fulfill stakeholder requirements.
- Architect and build performant and scalable data engineering pipelines using AWS; drive data system design, modeling, data quality, and delivery.
- Work with senior leadership and stakeholders to understand data value, define key metrics for competitive positioning, and gather requirements.
- Collaborate with data engineers, analysts, data scientists, and ML/AI engineers to operationalize models and integrate LLM-related logic into data pipelines.
- Gain direct exposure to strategic decision-making, analyzing company strategy and new venture planning from a quantitative perspective.
- Use Scrum methodologies to drive prioritization, planning, and execution of tasks for the data intelligence team.
Requirements
- An MSc/PhD in Computer Science, AI, or a related field is a strong plus.
- Typically 7+ years of professional experience in data engineering or data science.
- Expertise in data modeling and data warehousing, including building and optimizing data schemas and data lakes.
- Hands-on experience with AWS data services such as Glue, Redshift, S3, Lambda, and Athena.
- Advanced proficiency in Python, SQL, PySpark.
- Hands-on experience with big data frameworks such as Apache Spark, transformation tools such as dbt, workflow orchestration tools such as Apache Airflow or AWS Step Functions, and Infrastructure-as-Code tools such as CloudFormation or Terraform.
- Hands-on MLOps experience operationalizing machine learning models in production.
- Experience leading and mentoring data engineers and driving data system projects from conception to completion.
- Solid critical thinking, research, and problem-solving skills, with the ability to quickly grasp any data-related domain.
- Solid foundation in large-scale data systems, including data warehousing and data processing (parallel processing, data partitioning, and cost efficiency).
- Solid data programming skills covering algorithms, data structures, design patterns, and relational databases.
- Solid software development practices, including version control, testing, and CI/CD.
- Knowledge of Machine Learning (ML), Natural Language Processing (NLP), or Deep Learning (DL).
- Knowledge of the blockchain domain and a passion for gaining further breadth and depth in it.