Citi

Vice President, Data Engineering

full-time

Origin: 🇮🇳 India


Job Level

Senior

Tech Stack

Apache, AWS, Azure, Cloud, Docker, Google Cloud Platform, Hadoop, HDFS, Java, Kafka, Kubernetes, MapReduce, Open Source, PySpark, Python, Scala, Spark

About the role

  • Responsible for design and development of big data solutions
  • Partner with domain experts, product managers, analysts, and data scientists to develop Big Data pipelines in Hadoop or Snowflake
  • Responsible for delivering data as a service framework
  • Responsible for moving all legacy workloads to cloud platform
  • Work with data scientists to build client pipelines using heterogeneous sources and provide engineering services for data science applications
  • Ensure automation through CI/CD across platforms both in cloud and on-premises
  • Research and assess open source technologies and components, then recommend and integrate them into the design and implementation
  • Be the technical expert and mentor other team members on Big Data and Cloud Tech stacks
  • Define needs around maintainability, testability, performance, security, quality and usability for data platform
  • Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes
  • Convert SAS-based pipelines into languages such as PySpark and Scala to execute on Hadoop and non-Hadoop ecosystems
  • Tune Big Data applications on Hadoop and non-Hadoop platforms for optimal performance
  • Evaluate new IT developments and evolving business requirements and recommend appropriate systems alternatives and/or enhancements to current systems
  • Produce detailed analyses of issues and recommend actions
  • Supervise day-to-day staff management issues, including resource management, work allocation, mentoring/coaching
  • Appropriately assess risk when business decisions are made and drive compliance with applicable laws, rules and regulations

Requirements

  • 10+ years of total IT experience
  • 8+ years of experience with Hadoop (Cloudera)/big data technologies
  • Advanced knowledge of the Hadoop ecosystem and Big Data technologies
  • Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
  • Experience designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python
  • Experience with Spark programming (PySpark, Scala, or Java)
  • Expert-level experience building pipelines using Apache Spark
  • Familiarity with core provider services from AWS, Azure or GCP, preferably having supported deployments on one or more of these platforms
  • Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning is required
  • Experience with containerization and related technologies (e.g. Docker, Kubernetes)
  • Experience with all aspects of DevOps (source control, continuous integration, deployments, etc.)
  • 1 year Hadoop administration experience preferred
  • 1+ year of SAS experience preferred
  • Proficient in programming in Java or Python with prior Apache Beam/Spark experience a plus
  • System-level understanding: data structures, algorithms, distributed storage and compute
  • Team management experience, including having led a team of data engineers and analysts
  • Experience in Snowflake or Delta Lake is a plus
  • Bachelor’s/University degree or equivalent experience, potentially Master’s degree