Tech Stack
Cloud, Google Cloud Platform, Hadoop, PySpark, Python, Shell Scripting, SQL, Unix
About the role
- Develop and maintain PySpark data pipelines and transformations
- Perform data cleaning, feature engineering, and build statistical/ML models
- Write SQL for data analytics and data transformations
- Work on Unix-based platforms, including shell scripting and cron job scheduling
- Work with big data ecosystem tools such as Hadoop and Hive, and use GitHub for version control
- Use GCP for cloud-based data solutions, data modelling, and data quality assessment and control
- Collaborate on projects involving banking and financial services data and adapt to changing technical environments
Requirements
- 5+ years of experience in a Data Analytics role, with at least 2 years of development experience in PySpark
- Very strong SQL skills
- Expertise in Python programming, with experience in data cleaning, feature engineering, transformation, and building statistical/ML models
- Experience working on Unix-based platforms, with basic knowledge of shell scripting, writing cron jobs, etc.
- Knowledge of the big data ecosystem (Hadoop, Hive) and of version management with GitHub
- Knowledge of cloud computing (GCP) and data modelling, with exposure to data quality assessment and control
- Exposure to working with data from the banking and financial services domain
- Highly adaptable to quickly changing technical environments, with strong organizational and analytical skills