Tech Stack
Airflow, Apache, AWS, Azure, Cloud, ETL, Google Cloud Platform, IoT, Kafka, MongoDB, PySpark, Python, Scala, Spark, SQL
About the role
- Develop ETL/ELT pipelines in Databricks (PySpark notebooks) or Snowflake (SQL/Snowpark), ingesting from sources such as Confluent Kafka (see the sketch after this list)
- Optimize data storage using Delta Lake/Iceberg table formats, ensuring reliability (e.g., time travel for auditing in fintech pipelines)
- Integrate with Azure ecosystems (e.g., Fabric for warehousing, Event Hubs for streaming) and support BI/ML teams (e.g., preparing features for demand forecasting models)
- Contribute to real-world use cases such as dashboards for healthcare outcomes or optimizing logistics routes with aggregated IoT data
- Write clean, maintainable code in Python or Scala
- Collaborate with analysts, engineers, and product teams to translate data needs into scalable solutions
- Ensure data quality, reliability, and observability across the pipelines
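For illustration, here is a minimal sketch of the kind of pipeline described above: reading a Kafka topic with PySpark Structured Streaming and appending to a Delta Lake table. The broker address, topic name, event schema, and storage paths are placeholders, not the team's actual configuration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Hypothetical event schema for an "orders" topic
order_schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the raw Kafka stream (placeholder broker and topic)
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
)

# Parse the JSON payload into typed columns
orders = (
    raw.select(F.from_json(F.col("value").cast("string"), order_schema).alias("o"))
    .select("o.*")
)

# Append to a Delta table; the checkpoint enables recovery on restart, and
# Delta's transaction log is what makes time travel/auditing possible.
query = (
    orders.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders")  # placeholder path
    .outputMode("append")
    .start("/mnt/delta/orders")  # placeholder path
)
```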
Requirements
- 3–6 years of hands-on experience in data engineering
- Experience with Databricks / Apache Spark for large-scale data processing
- Familiarity with Kafka, Kafka Connect, and streaming data use cases
- Proficiency in Snowflake — including ELT design, performance tuning, and query optimization
- Exposure to MongoDB and working with flexible document-based schemas
- Strong programming skills in Python or Scala
- Comfort with CI/CD pipelines, data testing, and monitoring tools
- Good to have: Experience with Airflow, dbt, or similar orchestration tools (a minimal DAG sketch follows this list)
- Good to have: Worked on cloud-native stacks (AWS, GCP, or Azure)
- Good to have: Contributed to data governance and access control practices
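For reference, a minimal Airflow DAG sketch illustrating the orchestration and data-testing practices listed above; the task names, schedule, and row-count check are hypothetical placeholders, not an existing production DAG.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_ingest(**_):
    # Placeholder for triggering the Databricks/Snowflake ingest step
    print("ingest step")


def check_row_count(**_):
    # Placeholder data-quality gate: fail the run if the load produced no rows
    row_count = 1  # replace with a real query against the target table
    if row_count == 0:
        raise ValueError("Data quality check failed: no rows loaded")


with DAG(
    dag_id="orders_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=run_ingest)
    quality = PythonOperator(task_id="quality_check", python_callable=check_row_count)

    # Run the quality gate only after ingestion succeeds
    ingest >> quality
```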