Allata

Senior Data Engineer, Snowflake

Full-time

Location Type: Office

Location: Pune, India

Job Level

Senior

Tech Stack

Airflow · Apache · AWS · Azure · Cloud · ETL · Google Cloud Platform · Hadoop · Kafka · PySpark · Python · Spark · SQL · Tableau

About the role

  • Architect, develop, and maintain scalable, efficient, and fault-tolerant data pipelines using Python and PySpark.
  • Design pipeline workflows for batch and real-time data processing using orchestration tools like Apache Airflow or Azure Data Factory (a minimal Airflow sketch follows this list).
  • Implement automated data ingestion frameworks to extract data from structured, semi-structured, and unstructured sources such as APIs, FTP, and data streams.
  • Architect and optimize scalable Data Warehouse and Data Lake solutions using Snowflake, Azure Data Lake, or AWS S3.
  • Implement partitioning, bucketing, and indexing strategies for efficient querying and data storage management.
  • Develop ETL/ELT pipelines using tools like Azure Data Factory or Snowflake to handle complex data transformations and business logic.
  • Integrate dbt to automate data transformations, ensuring modularity and testability.
  • Ensure pipelines are optimized for cost-efficiency and high performance.
  • Write, optimize, and troubleshoot complex SQL queries for data manipulation, aggregation, and reporting.
  • Design and implement dimensional and normalized data models (star and snowflake schemas) for analytics use cases.
  • Deploy and manage data workflows on cloud platforms using services like AWS Glue, Azure Synapse Analytics, or Databricks.
  • Monitor resource usage and costs, implementing cost-saving measures such as data lifecycle management and auto-scaling.
  • Implement data quality frameworks to validate, clean, and enrich datasets.
  • Build self-healing mechanisms to minimize downtime and ensure reliability of critical pipelines.
  • Optimize Spark workflows by tuning executor memory and partitioning (see the PySpark sketch after this list).
  • Conduct profiling and debugging of data workflows to identify and resolve bottlenecks.
  • Collaborate with data analysts, scientists, and stakeholders to define requirements and deliver usable datasets.
  • Maintain clear documentation for pipelines, workflows, and architectural decisions.
  • Conduct code reviews to ensure best practices in coding and performance optimization.
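
For illustration, the orchestration described above might look like the following minimal Apache Airflow DAG (Airflow 2.4+ style). This is a sketch, not part of the posting: the DAG id, schedule, and the run_ingestion/run_transform callables are hypothetical placeholders.

```python
# Minimal daily batch DAG sketch (Airflow 2.4+ API; placeholder names throughout).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_ingestion(**context):
    # Placeholder: pull raw data from an API/FTP source into the lake.
    print(f"ingesting data for {context['ds']}")


def run_transform(**context):
    # Placeholder: kick off a PySpark or dbt transformation job.
    print(f"transforming data for {context['ds']}")


with DAG(
    dag_id="daily_batch_pipeline",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=run_ingestion)
    transform = PythonOperator(task_id="transform", python_callable=run_transform)

    ingest >> transform  # transform runs only after ingestion succeeds
```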
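
The partitioning, bucketing, and Spark-tuning bullets can likewise be sketched in PySpark. Every config value, path, and table name below is an assumption for illustration; real settings depend on cluster size and data volume.

```python
# PySpark sketch: executor tuning plus partitioned and bucketed writes.
# Config values, paths, and table names are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("orders_pipeline")                     # hypothetical app name
    .config("spark.executor.memory", "8g")          # tune to the cluster
    .config("spark.sql.shuffle.partitions", "200")  # match data volume
    .getOrCreate()
)

orders = spark.read.parquet("s3a://example-bucket/raw/orders/")  # placeholder path

# Repartition by the write key first so each output partition gets a
# small number of files instead of many tiny ones.
orders = orders.repartition("order_date")

# Partition on disk by date so date-range queries prune whole directories.
(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://example-bucket/curated/orders/")
)

# Bucketing requires a metastore-backed table (saveAsTable, not a bare path).
(
    orders.write
    .mode("overwrite")
    .bucketBy(16, "customer_id")  # co-locates rows for joins on customer_id
    .sortBy("customer_id")
    .saveAsTable("curated.orders_bucketed")  # assumes a 'curated' database exists
)
```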

Requirements

  • Advanced skills in Python and PySpark for high-performance distributed data processing.
  • Proficient in creating data pipelines with orchestration frameworks like Apache Airflow or Azure Data Factory.
  • Strong experience with Snowflake, SQL Data Warehouse, and Data Lake architectures.
  • Ability to write, optimize, and troubleshoot complex SQL queries and stored procedures.
  • Deep understanding of building and managing ETL/ELT workflows using tools such as dbt, Snowflake, or Azure Data Factory.
  • Hands-on experience with cloud platforms such as Azure or AWS, including services like S3, Lambda, Glue, or Azure Blob Storage.
  • Proficient in designing and implementing data models, including star and snowflake schemas (see the star-schema sketch after this list).
  • Familiarity with distributed processing systems and concepts such as Spark, Hadoop, or Databricks.
  • Experience with real-time data processing frameworks such as Kafka or Kinesis (a streaming sketch follows this list).
  • Certifications in Snowflake (good to have).
  • Cloud Certifications (Azure, AWS, GCP) (good to have).
  • Knowledge of data visualization platforms such as Power BI, Tableau, or Looker.
  • Strong teamwork, communication skills, and intellectual curiosity.
  • Ability to identify, troubleshoot, and resolve complex data issues effectively.
  • Willingness to embrace new tools, technologies, and methodologies.
  • Innovative thinker with a proactive approach to overcoming challenges.
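
As a minimal illustration of the star-schema modeling called out above, the sketch below aggregates a hypothetical fact table joined to two dimensions; all table and column names are invented for the example.

```python
# PySpark sketch of a star-schema query: one fact table, two dimensions.
# All table and column names are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("star_schema_demo").getOrCreate()

fact_sales = spark.table("analytics.fact_sales")      # grain: one row per sale
dim_date = spark.table("analytics.dim_date")          # surrogate key: date_key
dim_customer = spark.table("analytics.dim_customer")  # surrogate key: customer_key

# Fan out from the fact table to the dimensions, then aggregate.
monthly_revenue = (
    fact_sales
    .join(dim_date, "date_key")
    .join(dim_customer, "customer_key")
    .groupBy("year_month", "customer_segment")
    .agg(F.sum("sale_amount").alias("revenue"))
)

monthly_revenue.show()
```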
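
And the real-time requirement might look like the following Structured Streaming sketch, which consumes a Kafka topic and lands it in the lake. The broker address, topic, and paths are placeholders, and the job needs the spark-sql-kafka connector package on the classpath.

```python
# PySpark Structured Streaming sketch: Kafka topic -> parquet in the lake.
# Requires the spark-sql-kafka connector; all endpoints below are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("kafka_ingest_demo").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    .select(F.col("value").cast("string").alias("payload"))
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://example-bucket/raw/events/")             # placeholder
    .option("checkpointLocation", "s3a://example-bucket/_chk/events/")
    .start()
)

query.awaitTermination()  # block until the stream is stopped
```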

ATS Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Python · PySpark · SQL · ETL · ELT · dbt · data modeling · data quality frameworks · data partitioning · data indexing
Soft skills
teamwork · communication · problem-solving · intellectual curiosity · innovative thinking · proactive approach
Certifications
Snowflake certification · Azure certification · AWS certification · GCP certification