Tech Stack
Amazon Redshift · AWS · Cloud · NoSQL · Python · SQL
About the role
- Develop code for data tools on the cloud (AWS) to perform the expected data aggregation and processing operations
- Support, troubleshoot and fix bugs in existing data pipelines
- Define, build and deliver high-quality data pipelines
- Participate in the Data Validation process
- Design and set up cloud infrastructure
- Design, construct, install, test and maintain data management systems
- Ensure that all systems meet the business/company requirements as well as industry practices
- Integrate emerging data management and software engineering technologies into existing data structures
- Establish processes for data mining, data modeling and data production
- Recommend ways to continually improve data reliability and quality
- Analyze and organize raw data
- Build data marts to be consumed by data scientists’ analyses and data visualization tools
- Collaborate with teams across EU, US, Brazil and other countries
Requirements
- Solid experience in designing, building and testing data pipelines
- Solid experience using Python for data engineering
- Experience with cloud services (especially AWS Glue)
- Good knowledge of Infrastructure as Code
- Knowledge of cloud data platforms (Amazon Redshift, AWS EMR, Snowflake)
- Knowledge of unit, integration and E2E testing
- Experience using Python or other scripting languages to automate day-to-day tasks
- Knowledge of different SQL and NoSQL databases
- Advanced or fluent proficiency in English (please submit your CV in English)
- Experience collaborating with teams in EU, US, Brazil and other countries