Value1

Data Engineer, PySpark

full-time

Location: 🇮🇳 India

Job Level

Mid-Level, Senior

Tech Stack

Cloud, Google Cloud Platform, Hadoop, PySpark, Python, Shell Scripting, SQL, Unix

About the role

  • Develop and maintain PySpark data pipelines and transformations
  • Perform data cleaning, feature engineering, and build statistical/ML models
  • Write SQL for data analytics and data transformations
  • Work on Unix-based platforms including shell scripting and scheduling cron jobs
  • Work within big data ecosystem tools such as Hadoop and Hive and use GitHub for version control
  • Use GCP for cloud-based data solutions, data modelling, and data quality assessment and control
  • Collaborate on projects involving banking and financial services data and adapt to changing technical environments

Requirements

  • 5+ years of experience in a data analytics role, including at least 2 years of PySpark development
  • Very strong SQL skills
  • Expertise in Python programming, with experience in data cleaning, feature engineering, data transformation, and building statistical/ML models
  • Experience working on Unix-based platforms, with basic knowledge of shell scripting, writing cron jobs, and similar tasks
  • Knowledge of the big data ecosystem, including Hadoop and Hive, and of version control with GitHub
  • Knowledge of cloud computing (GCP) and data modelling, with exposure to data quality assessment and control
  • Exposure to data from the banking and financial services domain
  • High adaptability in quickly changing technical environments, with strong organizational and analytical skills
Poppulo

SDE 2, Data Engineer

Poppulo
Mid · Senior · full-time · 🇮🇳 India
Posted: 1 day ago · Source: boards.greenhouse.io
Airflow, Amazon Redshift, AWS, BigQuery, Cloud, PySpark, Python, Spark, SQL, TypeScript
EXL

Lead Assistant Manager – Data Engineering, Cloud Data Engineering

EXL
Senior · full-time · 🇮🇳 India
Posted: 1 day ago · Source: fa-ewjt-saasfaprod1.fa.ocs.oraclecloud.com
Airflow, AWS, Azure, Cloud, ETL, Google Cloud Platform, Hadoop, HDFS, Informatica, PySpark, Python, Spark, +1 more
Western Digital

Principal Engineer – Enterprise Data Platform, Data Warehouse, Data Modeling, GCP, Azure, AWS

Western Digital
Lead · full-time · 🇮🇳 India
Posted: 2 days ago · Source: jobs.smartrecruiters.com
Amazon Redshift, AWS, Azure, BigQuery, Cloud, ERP, ETL, Informatica, Oracle, Python, Scala, Spark, +1 more
Novartis

Business Data Migration Expert – Order to Cash

Novartis
Mid · Senior · full-time · 🇮🇳 India
Posted: 2 days ago · Source: novartis.wd3.myworkdayjobs.com
ERP
Caterpillar Inc.

Senior Manager, Software Engineering – Data Engineering

Caterpillar Inc.
Senior · full-time · 🇮🇳 India
Posted: 3 days ago · Source: cat.wd5.myworkdayjobs.com
Apache, AWS, Azure, Cloud, ETL, Jenkins, Kafka, NoSQL, Spark, SQL