InfraCloud Technologies

Data Engineer

Full-time

Location: 🇮🇳 India

Job Level

Mid-Level / Senior

Tech Stack

Airflow, Apache, AWS, Azure, Cloud, ETL, Google Cloud Platform, IoT, Kafka, MongoDB, PySpark, Python, Scala, Spark, SQL

About the role

  • Develop ETL/ELT pipelines in Databricks (PySpark notebooks) or Snowflake (SQL/Snowpark), ingesting from sources like Confluent Kafka (see the sketch after this list)
  • Handle data storage optimizations using Delta Lake/Iceberg formats, ensuring reliability (e.g., time travel for auditing in fintech pipelines)
  • Integrate with Azure ecosystems (e.g., Fabric for warehousing, Event Hubs for streaming) and support BI/ML teams (prepare features for demand forecasting models)
  • Contribute to real-world use cases such as building dashboards for healthcare outcomes or optimizing logistics routes with aggregated IoT data
  • Write clean, maintainable code in Python or Scala
  • Collaborate with analysts, engineers, and product teams to translate data needs into scalable solutions
  • Ensure data quality, reliability, and observability across the pipelines
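
For illustration, here is a minimal sketch of the kind of pipeline the first responsibility describes: a PySpark structured-streaming job that reads a Confluent Kafka topic and appends it to a Delta Lake table. The broker address, topic name, event schema, and paths are all hypothetical placeholders, and it assumes a Databricks runtime (or delta-spark configured locally).

```python
# Sketch only: read JSON events from Kafka and land them in a Delta table.
# Broker, topic, schema, and paths below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("kafka-to-delta").getOrCreate()

# Hypothetical schema for the JSON payloads on the topic.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "iot-readings")               # placeholder topic
    .load()
)

# Kafka delivers bytes; cast the value to a string and parse the JSON.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
       .select("e.*")
)

# Append to Delta; the checkpoint gives exactly-once recovery on restart.
(
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/iot-readings")  # placeholder path
    .start("/tmp/delta/iot_readings")                               # placeholder path
)
```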

Requirements

  • 3–6 years of hands-on experience in data engineering
  • Experience with Databricks / Apache Spark for large-scale data processing
  • Familiarity with Kafka, Kafka Connect, and streaming data use cases
  • Proficiency in Snowflake — including ELT design, performance tuning, and query optimization
  • Exposure to MongoDB and working with flexible document-based schemas
  • Strong programming skills in Python or Scala
  • Comfort with CI/CD pipelines, data testing, and monitoring tools (a minimal data-test sketch follows this list)
  • Good to have: Experience with Airflow, dbt, or similar orchestration tools
  • Good to have: Worked on cloud-native stacks (AWS, GCP, or Azure)
  • Good to have: Contributed to data governance and access control practices
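
As a hedged sketch of the "data testing" requirement above, the pytest module below checks a pipeline output table for duplicate keys and null timestamps. The table path and column names are hypothetical and match the Kafka-to-Delta sketch earlier; in practice such checks would run as a CI step against a staging table.

```python
# Sketch only: pytest data-quality checks against a Delta table.
# The path and column names are hypothetical placeholders.
import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.appName("data-tests").getOrCreate()

def _load(spark):
    return spark.read.format("delta").load("/tmp/delta/iot_readings")  # placeholder path

def test_no_duplicate_device_readings(spark):
    df = _load(spark)
    total = df.count()
    distinct = df.select("device_id", "event_time").distinct().count()
    assert total == distinct, "duplicate (device_id, event_time) rows found"

def test_no_null_event_times(spark):
    df = _load(spark)
    assert df.filter(df.event_time.isNull()).count() == 0, "null event_time rows found"
```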