Software Engineer, Database

Affinity Global Inc.

full-time

Location: 🇮🇳 India

Job Level

Mid-Level / Senior

Tech Stack

Apache, ETL, Hadoop, HBase, HDFS, Java, Kafka, MapReduce, MySQL, Spark, SQL

About the role

  • Design, develop, and optimize database schemas, tables, indexes, and relationships to ensure efficient data storage and retrieval.
  • Write complex SQL queries, stored procedures, triggers, and functions to support business and application requirements (see the stored-procedure sketch after this list).
  • Gather, clean, and process raw structured and unstructured data from multiple sources (APIs, relational DBs, distributed file systems).
  • Design and implement ETL pipelines for data ingestion, transformation, and storage using MySQL, Hadoop, and Spark.
  • Perform query optimization, indexing, and partitioning to improve database performance.
  • Manage replication, clustering, and failover strategies to ensure high availability.
  • Design and manage large-scale datasets using Hadoop ecosystem components (HDFS, MapReduce, Hive, Impala, Kafka, HBase, Pig).
  • Build and maintain real-time streaming pipelines using Apache Spark and Spark Streaming.
  • Collaborate with cross-functional engineering teams and DevOps to integrate scalable data solutions into production systems and support CI/CD.
  • Take end-to-end responsibility for database lifecycle management (MySQL + Big Data ETL + Analytics).
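
As a rough illustration of the stored-procedure and trigger work described above, here is a minimal MySQL sketch. The orders, orders_archive, and orders_audit tables, their columns, and the archiving rule are hypothetical, chosen only to show the shape of this kind of routine, not taken from the posting.

  DELIMITER //

  CREATE PROCEDURE archive_old_orders(IN cutoff DATE)
  BEGIN
      -- Copy closed orders older than the cutoff into an archive table
      -- (assumes orders_archive has the same columns as orders).
      INSERT INTO orders_archive
      SELECT * FROM orders
      WHERE status = 'CLOSED' AND order_date < cutoff;

      -- Then remove them from the hot table to keep it small.
      DELETE FROM orders
      WHERE status = 'CLOSED' AND order_date < cutoff;
  END //

  CREATE TRIGGER orders_audit_after_update
  AFTER UPDATE ON orders
  FOR EACH ROW
  BEGIN
      -- Log status transitions for auditing; other column updates are ignored.
      IF OLD.status <> NEW.status THEN
          INSERT INTO orders_audit (order_id, old_status, new_status, changed_at)
          VALUES (OLD.id, OLD.status, NEW.status, NOW());
      END IF;
  END //

  DELIMITER ;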

Requirements

  • 3+ years of SQL (MySQL) experience.
  • 2+ years of hands-on experience with the Cloudera Hadoop Distribution and Apache Spark.
  • Proficiency in database development (queries, triggers, stored procedures) and knowledge of DB internals.
  • Experience with database administration, performance tuning, replication, backup, and restoration (see the indexing and partitioning sketch after this list).
  • Comprehensive knowledge of Hadoop Architecture, HDFS, MapReduce, Hive, Impala, Kafka, HBase, Pig, and Java.
  • Experience processing large structured and unstructured datasets.
  • Experience designing and implementing ETL pipelines for data ingestion, transformation, and storage.
  • Experience building and maintaining real-time streaming pipelines using Apache Spark and Spark Streaming.
  • Experience collaborating with DevOps to support CI/CD for database-related deployments.
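
The sketch below illustrates the indexing and partitioning side of the tuning work mentioned above, again in MySQL. The events table, its columns, and the query are hypothetical; the point is the combination of range partitioning, a composite index, and EXPLAIN to check the access path and partition pruning.

  -- Hypothetical fact table, range-partitioned by year of created_at.
  -- MySQL requires the partitioning column to be part of every unique key,
  -- hence the composite primary key.
  CREATE TABLE events (
      id         BIGINT NOT NULL AUTO_INCREMENT,
      user_id    BIGINT NOT NULL,
      event_type VARCHAR(64) NOT NULL,
      created_at DATETIME NOT NULL,
      PRIMARY KEY (id, created_at)
  )
  PARTITION BY RANGE (YEAR(created_at)) (
      PARTITION p2023 VALUES LESS THAN (2024),
      PARTITION p2024 VALUES LESS THAN (2025),
      PARTITION pmax  VALUES LESS THAN MAXVALUE
  );

  -- Composite index matching the most common lookup pattern.
  CREATE INDEX idx_events_user_type ON events (user_id, event_type, created_at);

  -- EXPLAIN shows whether the index is used and which partitions are scanned.
  EXPLAIN SELECT event_type, COUNT(*) AS hits
  FROM events
  WHERE user_id = 42
    AND created_at >= '2024-01-01'
  GROUP BY event_type;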