Tech Stack
Airflow, BigQuery, Google Cloud Platform, Hadoop, MySQL, NoSQL, PySpark, Python, Spark, SQL
About the role
- Migrate dashboards/reports from Hadoop to GCP, including migrating Hive and Pig jobs to BigQuery/Spark
- Build data processing and data modeling applications using Python, PySpark, and related data engineering tools
- Ensure data consistency and accuracy through data validation and cleansing techniques
- Work with cross-functional teams to identify and address data-related issues
- Design and implement data pipelines on GCP, orchestrated with Airflow (see the sample DAG sketch after this list)
- Apply security and governance practices such as RBAC and data lineage
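To give a concrete feel for the Airflow work above, here is a minimal sketch of the kind of DAG this role would own: a daily workflow that loads the output of a migrated job into BigQuery. This is an illustrative example, not a description of the team's actual pipelines; the DAG id, bucket, paths, and table names are hypothetical placeholders, and it assumes the `apache-airflow-providers-google` package and Airflow 2.4+ (for the `schedule` argument).

```python
# Hypothetical sketch of a daily GCS-to-BigQuery load DAG.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

with DAG(
    dag_id="hive_report_migration",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Load the Parquet output of a migrated Spark job into BigQuery,
    # replacing the table contents on each run.
    load_to_bq = GCSToBigQueryOperator(
        task_id="load_report_to_bigquery",
        bucket="example-data-lake",                  # hypothetical bucket
        source_objects=["reports/daily/*.parquet"],  # hypothetical path
        source_format="PARQUET",
        destination_project_dataset_table="analytics.daily_report",  # hypothetical
        write_disposition="WRITE_TRUNCATE",
    )
```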
Requirements
- 5+ years of experience as a Data Engineer
- Experience with GCP and Airflow (required)
- Proficiency in Python and PySpark
- Experience with SQL and NoSQL databases
- Experience migrating Hive and Pig jobs to BigQuery/Spark (see the PySpark sketch after this list)
- Knowledge of database management systems (e.g., MySQL)
- Knowledge of security and governance practices: role-based access control (RBAC) and data lineage tools
- Strong problem-solving and analytical skills
- Ability to work independently and collaboratively in a fast-paced environment
- Bachelor’s degree in Computer Science, Engineering, or a related field
- Keen interest in learning and adapting
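As a rough illustration of the Hive-to-BigQuery migration work referenced above, the sketch below reads a legacy Hive table in PySpark, applies a simple validation/cleansing step, and writes to BigQuery. It is a minimal example under stated assumptions, not the team's actual code: it assumes the spark-bigquery connector is on the classpath, and the table, column, and bucket names are hypothetical placeholders.

```python
# Hypothetical sketch: migrate a Hive-produced table to BigQuery with PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("hive_to_bigquery_migration")  # hypothetical app name
    .enableHiveSupport()
    .getOrCreate()
)

# Read the legacy Hive table that the old Hive/Pig job produced.
df = spark.table("warehouse.daily_report")  # hypothetical Hive table

# Basic validation and cleansing: drop rows missing the business key,
# then deduplicate on it before loading downstream.
clean = (
    df.filter(F.col("order_id").isNotNull())  # hypothetical key column
      .dropDuplicates(["order_id"])
)

# Write to BigQuery via the spark-bigquery connector, replacing the table.
(
    clean.write.format("bigquery")
    .option("table", "analytics.daily_report")        # hypothetical BQ table
    .option("temporaryGcsBucket", "example-staging")  # hypothetical bucket
    .mode("overwrite")
    .save()
)
```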