Tech Stack
AWS, ETL, Hadoop, HBase, HDFS, Kafka, PySpark, Python, Scala, SDLC, Spark
About the role
- Job Title: Data Engineer
- Location: Cleveland, OH; Pittsburgh, PA; or Dallas, TX
- Duration: Full-time
- Domain: Banking
- Key skills: Python, Spark, and Hadoop
- Job duties: analyze, design, and code ETL programs covering data pre-processing, extraction, ingestion, quality, normalization, and loading; deliver projects in an Agile environment and coordinate deliverables across the SDLC.
Requirements
- 6-7+ years of experience in Data Engineering and Data Analysis.
- Hands-on experience with the Hadoop stack of technologies (Hadoop, PySpark, HBase, Hive, Pig, Sqoop, Scala, Flume, HDFS, MapReduce).
- Hands-on experience with Python and Kafka.
- Good understanding of database concepts, data design, data modeling, and ETL.
- Hands-on experience analyzing, designing, and coding ETL programs, including data pre-processing, extraction, ingestion, quality, normalization, and loading.
- Experience delivering projects using Agile methodology; hands-on with Jira.
- Experience in client-facing roles, with strong communication and thought-leadership skills to coordinate deliverables across the SDLC.
- Good understanding of machine learning models and artificial intelligence is preferred.
- Good understanding of data components, data processing, and data analytics on AWS is a plus.
- Experience with data modeling tools such as Erwin is a plus.
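The ETL flow named in the duties above (extraction, quality checks, normalization, loading) can be illustrated with a minimal plain-Python sketch. The CSV feed, column names, and min-max normalization here are purely illustrative assumptions; a production job in this stack would typically run the same stages as a PySpark pipeline reading from HDFS or Kafka.

```python
import csv
import io

# Illustrative input; in practice this would be extracted from HDFS, Kafka, etc.
RAW_CSV = """account_id,balance
A1,100
A2,
A3,300
"""

def run_etl(raw_text):
    # Extraction: parse the raw CSV feed into records.
    rows = list(csv.DictReader(io.StringIO(raw_text)))

    # Data quality: drop records with a missing balance.
    clean = [r for r in rows if r["balance"].strip()]

    # Normalization: min-max scale balances into the [0, 1] range.
    balances = [float(r["balance"]) for r in clean]
    lo, hi = min(balances), max(balances)
    for r, b in zip(clean, balances):
        r["balance_norm"] = (b - lo) / (hi - lo) if hi > lo else 0.0

    # Loading: return the records; a real job would write them to a
    # warehouse table or an HDFS path instead.
    return clean

records = run_etl(RAW_CSV)
# The A2 record is dropped by the quality step; A1 and A3 are normalized.
```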