Tech Stack
Airflow, AWS, Cloud, ETL, Jenkins, Open Source, Python, SOAP, Spark, SQL, Terraform
About the role
- Build core data architecture components from scratch as part of enterprise data architecture enhancements.
- Analyze and optimize the existing data architecture using best practices.
- Identify, analyze, and integrate the various data source systems used across the organization.
- Ensure the existing data architecture performs optimally and undertake regular maintenance tasks.
Requirements
- Extensive experience in a Data Engineer role.
- Experience developing enterprise data architecture using the latest Big Data technologies, preferably open source.
- Experience with AWS services (S3, Glue, Lambda, SQS, SNS, EventBridge, Athena, AppFlow, etc.) and familiarity with cloud-native architecture (see the boto3 sketch after this list).
- Experience designing complex workflows with orchestration tools such as Airflow, AWS Data Pipeline, etc. (see the Airflow sketch after this list).
- Strong experience with Terraform for infrastructure automation and cloud management.
- Experience building scalable ETL pipelines that transform and integrate data from various sources (e.g., SaaS applications) into centralized storage, using tools such as Fivetran, AWS AppFlow, and Xplenty.
- Experience developing custom integration layers in Python over REST and SOAP APIs (nice to have; see the REST client sketch after this list).
- Strong ability to understand, design, write, and debug complex code in Python, Spark, and SQL is a must (see the PySpark sketch after this list).
- Experience with dbt for managing data transformation workflows within a data lakehouse environment.
- Experience with CI/CD tools like Buildkite and Jenkins.
- Understanding of data modeling, data governance, and security best practices in cloud environments.
- Strong analytical and troubleshooting skills to identify bottlenecks, optimize data flows, and ensure system scalability and performance.
- Experience implementing MLOps solutions that deploy machine learning models into production efficiently (a plus).
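The sketches below are illustrative only and not part of the role description. First, a minimal boto3 sketch of an event-driven AWS step: a Lambda handler that starts a Glue job when a new object lands in S3 (for example via an S3 notification or EventBridge trigger). The job name, bucket, and argument names are hypothetical.

```python
import boto3

glue = boto3.client("glue")


def handler(event, context):
    # Lambda entry point for an S3 "object created" notification.
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        glue.start_job_run(
            JobName="example-etl-job",          # hypothetical Glue job name
            Arguments={"--input_key": key},     # forward the new object key
        )
```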
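Next, a minimal Airflow sketch of a daily extract-and-load workflow, assuming Airflow 2.4+ (the `schedule` argument). DAG, task, and bucket names are hypothetical placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # Stub for pulling raw records from a source system.
    return [{"order_id": 1, "amount": 42.0}]


def load_to_s3(**context):
    # Push the extracted records to centralized storage (e.g. an S3 bucket).
    records = context["ti"].xcom_pull(task_ids="extract_orders")
    print(f"would write {len(records)} records to s3://example-bucket/orders/")


with DAG(
    dag_id="orders_etl",               # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_s3", python_callable=load_to_s3)
    extract >> load
```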
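A minimal sketch of a custom REST integration layer in Python using `requests`, with retries and page-based pagination. The endpoint, token handling, and response fields are assumptions, not details of any particular source system.

```python
import requests
from requests.adapters import HTTPAdapter, Retry


def fetch_all_records(base_url: str, token: str) -> list[dict]:
    session = requests.Session()
    # Retry transient failures (HTTP 429/5xx) with exponential backoff.
    retries = Retry(total=3, backoff_factor=1, status_forcelist=[429, 500, 502, 503])
    session.mount("https://", HTTPAdapter(max_retries=retries))
    session.headers.update({"Authorization": f"Bearer {token}"})

    records, page = [], 1
    while True:
        resp = session.get(f"{base_url}/records", params={"page": page}, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(payload["items"])          # assumed response shape
        if not payload.get("next_page"):          # assumed pagination field
            break
        page += 1
    return records
```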
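Finally, a minimal PySpark sketch mixing the DataFrame API and Spark SQL: read raw events, deduplicate, aggregate daily counts, and write back to centralized storage. Paths and column names are illustrative only.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events_rollup").getOrCreate()

# Hypothetical input path; any S3 or Glue-catalogued source would work similarly.
events = spark.read.parquet("s3://example-bucket/raw/events/")

deduped = events.dropDuplicates(["event_id"])
deduped.createOrReplaceTempView("events")

# The same aggregation expressed in SQL on top of Spark.
daily = spark.sql("""
    SELECT date(event_ts) AS event_date, count(*) AS event_count
    FROM events
    GROUP BY date(event_ts)
""")

daily.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_events/")
```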