Tech Stack
Airflow, Amazon Redshift, Apache Spark, AWS, ETL, PySpark, SparkSQL
About the role
- Work as a full-stack AWS Data Engineer solving business problems and building data products
- Collaborate with business leads, analysts, and data scientists to understand the business domain
- Write efficient PySpark code and build ETL jobs in AWS Glue
- Write and optimize SQL queries in Amazon Athena and Amazon Redshift
- Explore new technologies and learn techniques to solve business problems
- Collaborate with engineering and business teams to build data products and services
- Deliver projects collaboratively and provide timely status updates to customers
- Ensure data quality of business metrics and build scalable, flexible solutions
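To illustrate the kind of SQL work the Athena/Redshift bullet describes, here is a minimal sketch of a business-metric aggregation query. It uses Python's built-in sqlite3 as a stand-in engine so the example runs anywhere; the table name, columns, and data are hypothetical, not from the posting.

```python
# Hypothetical example: an aggregation query of the sort written in Amazon
# Athena or Amazon Redshift. sqlite3 (stdlib) stands in for the warehouse
# engine; "orders", "region", and "amount" are illustrative names.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("us-east", 120.0), ("us-east", 80.0), ("eu-west", 50.0)],
)

# A typical business-metric query: total revenue per region, largest first.
rows = conn.execute(
    """
    SELECT region, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
    """
).fetchall()
print(rows)  # → [('us-east', 200.0), ('eu-west', 50.0)]
```

The same GROUP BY / ORDER BY shape carries over directly to Athena and Redshift SQL, where optimization additionally involves partitioning and distribution keys.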
Requirements
- 1 to 3 years of experience in Apache Spark, PySpark, AWS Glue
- 2+ years of experience in writing ETL jobs using PySpark and SparkSQL
- 2+ years of experience in SQL queries and stored procedures
- Deep understanding of the DataFrame API and Spark 2.x transformation functions
- Experience writing SQL in Amazon Athena and Amazon Redshift
- Experience with PySpark and AWS Glue
- Preferred: prior experience with AWS EMR and Apache Airflow
- Preferred: certifications - AWS Certified Big Data – Specialty, Cloudera Certified Big Data Engineer, or Hortonworks Certified Big Data Engineer
- Understanding of DataOps Engineering
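The DataFrame API requirement above refers to transformation chains such as filter → groupBy → agg. Below is a hypothetical sketch of that pattern using plain Python, since a Spark cluster can't be assumed here; the equivalent PySpark calls appear in the comments, and all names and data are illustrative.

```python
# Hypothetical sketch of a Spark DataFrame transformation chain
# (filter -> groupBy -> agg), with plain Python standing in so the
# example runs without PySpark installed.
from collections import defaultdict

events = [
    {"user": "a", "kind": "click", "ms": 30},
    {"user": "a", "kind": "view", "ms": 10},
    {"user": "b", "kind": "click", "ms": 50},
]

# PySpark equivalent: df.filter(df.kind == "click")
clicks = [e for e in events if e["kind"] == "click"]

# PySpark equivalent: .groupBy("user").agg(F.sum("ms").alias("total_ms"))
totals = defaultdict(int)
for e in clicks:
    totals[e["user"]] += e["ms"]

print(dict(totals))  # → {'a': 30, 'b': 50}
```

In PySpark these transformations are lazy and only execute on an action such as `.show()` or `.collect()`, which is central to writing efficient jobs.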