Tech Stack
Airflow, AWS, Azure, BigQuery, Cloud, ETL, Google Cloud Platform, PySpark, Python, RDBMS, SQL, Tableau
About the role
- Maintain, enhance, and optimize existing data warehouse architecture and ETL pipelines.
- Design and implement scalable ETL/ELT processes ensuring data quality, integrity, and timeliness.
- Monitor and improve pipeline performance, troubleshoot issues, and implement best practices.
- Create and maintain comprehensive documentation for data engineering processes, architecture, and configurations.
- Partner with business teams to gather requirements and translate them into technical solutions.
- Build and maintain Power BI dashboards and reports that drive business decisions.
- Develop new data models and enhance existing ones to support advanced analytics.
- Translate complex data findings into clear, actionable insights for stakeholders across departments.
Requirements
- Programming & Query Languages: Strong proficiency in Python, SQL, and PySpark.
- Big Data Platforms: Experience with cloud data platforms such as Snowflake, BigQuery, or Databricks (Databricks highly preferred).
- Orchestration Tools: Proven experience with workflow orchestration tools (Airflow preferred).
- Cloud Platforms: Experience with AWS (preferred), Azure, or Google Cloud Platform.
- Data Visualization: Proficiency in Power BI (preferred) or Tableau.
- Database Systems: Familiarity with relational database management systems (RDBMS).
- Version Control: Proficient with Git for code management and collaboration.
- CI/CD: Hands-on experience implementing and maintaining continuous integration/deployment pipelines.
- Documentation: Strong ability to create clear technical documentation.
- Professional Experience: 3+ years in data engineering or closely related roles.
- Language Requirements: Fluent English communication skills for effective collaboration with U.S.-based team members.
- Pipeline Expertise: Demonstrated experience building and maintaining production data pipelines.