Data Engineer, PySpark

Full-time

Location: 🇮🇳 India


Job Level

Mid-Level, Senior

Tech Stack

Cloud, Google Cloud Platform, Hadoop, PySpark, Python, Shell Scripting, SQL, Unix

About the role

  • Develop and maintain PySpark data pipelines and transformations (a minimal sketch follows this list)
  • Perform data cleaning, feature engineering, and build statistical/ML models
  • Write SQL for data analytics and data transformations
  • Work on Unix-based platforms, including shell scripting and scheduling cron jobs
  • Work with big data ecosystem tools such as Hadoop and Hive, using GitHub for version control
  • Use GCP for cloud-based data solutions, data modelling, and data quality assessment and control
  • Collaborate on projects involving banking and financial services data, adapting to changing technical environments
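
For context, here is a minimal sketch of the kind of PySpark pipeline the role describes. The paths, table, and column names (transactions, txn_id, customer_id, amount) and the cleaning rules are illustrative assumptions, not details from the posting:

    # Minimal PySpark pipeline sketch; all paths and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("txn_pipeline").getOrCreate()

    # Ingest raw data (the path is assumed for illustration).
    raw = spark.read.parquet("/data/raw/transactions")

    # Data cleaning: de-duplicate and drop rows missing key fields.
    clean = (
        raw.dropDuplicates(["txn_id"])
           .na.drop(subset=["customer_id", "amount"])
           .filter(F.col("amount") > 0)
    )

    # Feature engineering: simple per-customer aggregates.
    features = (
        clean.groupBy("customer_id")
             .agg(
                 F.count("*").alias("txn_count"),
                 F.sum("amount").alias("total_spend"),
                 F.avg("amount").alias("avg_spend"),
             )
    )

    # Persist the transformed output for downstream analytics.
    features.write.mode("overwrite").parquet("/data/curated/customer_features")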

Requirements

  • 5+ years of experience in a Data Analytics role, including at least 2 years of PySpark development experience
  • Very strong SQL skills
  • Expertise in Python programming, with experience in data cleaning, feature engineering, transformation, and building statistical/ML models (see the modelling sketch after this list)
  • Experience working on Unix-based platforms, with basic knowledge of shell scripting and writing cron jobs
  • Familiarity with the big data ecosystem, including Hadoop, Hive, and GitHub version management
  • Knowledge of cloud computing (GCP) and data modelling, with exposure to data quality assessment and control
  • Exposure to data from the banking and financial services domain
  • Highly adaptable to quickly changing technical environments, with strong organizational and analytical skills
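
To make the modelling expectation concrete, here is a minimal PySpark MLlib sketch that fits a baseline model on engineered features; the input path, the feature columns, and the "churned" label are all hypothetical:

    # Minimal MLlib sketch: assemble feature columns and fit a baseline model.
    # The feature columns and the binary label "churned" are assumed to exist
    # in the input data; they are not specified by the posting.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("model_sketch").getOrCreate()
    df = spark.read.parquet("/data/curated/customer_features")  # assumed path

    # Combine numeric columns into the single vector column MLlib expects.
    assembler = VectorAssembler(
        inputCols=["txn_count", "total_spend", "avg_spend"],
        outputCol="features",
    )
    train = assembler.transform(df)

    # Fit a simple classifier against the hypothetical label.
    lr = LogisticRegression(featuresCol="features", labelCol="churned")
    model = lr.fit(train)
    print(model.coefficients)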