Tech Stack
AWS, Azure, Cloud, ETL, Google Cloud Platform, PySpark, Python, SQL
About the role
- The Python Developer will report to the Lead Data Engineer and work closely with data developers and engineers within the data team.
- The core focus of the role is Python/PySpark development for ETL processes, using SQL to query the lakehouse and support analytics.
- The role also involves using GitHub for version control and working with cloud platforms such as GCP, Azure, and AWS to design, build, and manage data pipelines, orchestration, and other internal services.
- Roles & Responsibilities:
- Develop and maintain data engineering solutions focused on ETL and data transformation (a representative sketch follows this list).
- Build, optimize, and manage scalable pipelines in cloud environments.
- Support analytics teams by delivering reliable and well-structured data.
- Collaborate with team members to ensure smooth integration and delivery of data services.
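To give a flavour of the day-to-day work, here is a minimal PySpark ETL sketch. It is illustrative only: the table and column names (raw.orders, analytics.daily_orders, order_ts, and so on) are hypothetical placeholders, not actual project objects.

```python
# Minimal extract-transform-load sketch in PySpark.
# All table/column names below are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-orders-etl").getOrCreate()

# Extract: query a raw lakehouse table with SQL
# (the DataFrame API would work equally well here).
orders = spark.sql(
    "SELECT order_id, customer_id, amount, order_ts FROM raw.orders"
)

# Transform: aggregate to one row per customer per day.
daily = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "customer_id")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("order_id").alias("order_count"),
    )
)

# Load: write the curated result back to the lakehouse.
daily.write.mode("overwrite").saveAsTable("analytics.daily_orders")
```

Mixing spark.sql for extraction with the DataFrame API for transformation, as above, is a common pattern in lakehouse ETL work.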
Requirements
- Proficient in Python for data engineering and transformation.
- Strong hands-on experience with PySpark and Databricks for big data processing.
- Skilled in SQL for querying and managing data in lakehouse and relational systems.
- Familiarity with YAML for cluster and workflow configuration (see the configuration sketch after this list).
- Experience with GitHub for version control and collaborative coding.
- Proficient with cloud platforms (AWS, Azure, GCP) for pipeline orchestration, data storage, and service management.
- Exposure to internally built orchestration tools for ETL workflows.
- Strong analytical and problem-solving skills with a focus on data quality.
- Effective team player with clear communication across technical and non-technical teams.
- Quick learner, adaptable to evolving tools and practices in data engineering.
- Detail-oriented with the ability to work under tight deadlines.
- Proactive and collaborative mindset in delivering high-quality solutions.
- Comfortable working within an Agile mindset and framework.
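As an illustration of the YAML-driven configuration mentioned above, the sketch below loads a hypothetical workflow file with PyYAML. The file name and keys (cluster.workers, job.source_table, job.target_table) are assumptions chosen only to show the pattern.

```python
# Driving an ETL job from a YAML workflow config.
# The file name and keys are hypothetical examples.
import yaml  # PyYAML

with open("etl_job.yaml") as f:
    config = yaml.safe_load(f)

workers = config["cluster"]["workers"]   # e.g. cluster sizing
source = config["job"]["source_table"]   # e.g. raw.orders
target = config["job"]["target_table"]   # e.g. analytics.daily_orders

print(f"Running ETL {source} -> {target} on {workers} workers")
```

Keeping cluster sizing and table names in a config file like this, rather than hard-coding them, is what makes pipelines straightforward to promote across environments.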