Tech Stack
Amazon Redshift, AWS, Cloud, ETL, Go, Python, SQL, Web3
About the role
- Architect and maintain robust, scalable, and secure data infrastructure on AWS leveraging Databricks
- Design, develop, and maintain data pipelines using Airbyte and custom-built services in Go to automate data ingestion and ETL
- Oversee the creation and maintenance of the data lake, ensuring efficient storage, high data quality, sensible partitioning, strong performance, and monitoring and alerting
- Integrate tools like Airbyte with various data sources and build custom connectors in Go where necessary (see the ingestion sketch after this list)
- Optimize data pipelines and data lake storage for performance, scalability, low latency and high availability
- Implement data governance, security, access control, encryption, and monitoring best practices in AWS and Databricks
- Collaborate with platform engineers, data analysts, and other stakeholders, and document infrastructure, processes, and best practices
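For context on the custom ingestion work described above, here is a minimal sketch of a Go service that pulls a batch of JSON from an upstream API and lands it in the raw zone of an S3-backed data lake. The endpoint, bucket name, and key layout are illustrative assumptions, not details from this listing.

```go
// Minimal ingestion sketch: fetch a batch of JSON from a (hypothetical)
// source API and write it to the raw zone of an S3 data lake.
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"log"
	"net/http"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

const (
	sourceURL = "https://api.example.com/v1/events" // hypothetical source API
	rawBucket = "example-data-lake-raw"             // hypothetical bucket
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Fetch one batch of records from the upstream API.
	resp, err := http.Get(sourceURL)
	if err != nil {
		log.Fatalf("fetch: %v", err)
	}
	defer resp.Body.Close()
	payload, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("read body: %v", err)
	}

	// Load AWS credentials/region from the environment or instance profile.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatalf("aws config: %v", err)
	}
	client := s3.NewFromConfig(cfg)

	// Land the batch under a timestamped key in the raw zone.
	key := fmt.Sprintf("raw/events/%s.json", time.Now().UTC().Format("20060102T150405"))
	if _, err := client.PutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String(rawBucket),
		Key:    aws.String(key),
		Body:   bytes.NewReader(payload),
	}); err != nil {
		log.Fatalf("put object: %v", err)
	}
	log.Printf("wrote %d bytes to s3://%s/%s", len(payload), rawBucket, key)
}
```

A production connector would add retries, schema validation, and metrics before being wired into orchestration; this only shows the fetch-and-land shape of the job.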
Requirements
- 3+ years of experience as a Data Engineer, with a focus on data lake architecture and ETL pipeline development
- Strong experience with Databricks and AWS services including S3, Glue, Lambda, Redshift, and IAM (a partition-layout sketch follows this list)
- Hands-on experience with Airbyte or similar ETL tools for data ingestion and transformation
- Experience writing services and connectors in Python and/or Go for data pipeline automation
- Solid understanding of data modeling, SQL, and database concepts
- Experience implementing security and governance best practices in cloud environments
- Right to work in the country where you are based (the company is unable to sponsor visas)
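As a rough illustration of the S3/Glue experience listed above and the partitioning responsibilities in the role, the sketch below builds Hive-style object keys that Glue crawlers and Databricks/Spark readers can use for partition pruning. All dataset, source, and path names are hypothetical.

```go
// Sketch of a Hive-style partitioned key layout for an S3 data lake.
// Encoding partition columns in the path (source=..., dt=...) lets Glue
// crawlers and Databricks readers prune partitions instead of scanning
// the whole prefix. Names are illustrative only.
package main

import (
	"fmt"
	"time"
)

// partitionKey builds an object key like:
//   raw/events/source=web/dt=2024-05-01/part-00042.json.gz
func partitionKey(dataset, source string, ts time.Time, part int) string {
	return fmt.Sprintf("raw/%s/source=%s/dt=%s/part-%05d.json.gz",
		dataset, source, ts.UTC().Format("2006-01-02"), part)
}

func main() {
	fmt.Println(partitionKey("events", "web", time.Now(), 42))
}
```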