Build and manage reliable data pipelines covering ingestion/collection, processing, integration, storage, and data provisioning across the organization.
Work within a distributed systems architecture for massively parallel processing (MPP), combining multiple heterogeneous data sources and collaborating with analytics and data science teams to build solutions and deliver data-driven value.
Requirements
Hands-on experience with ingestion, integration, processing, and storage of large volumes of data;
Experience working on Big Data projects;
Experience with Behavior Driven Development (BDD);
Data extraction using Python and data processing with PySpark;
Experience with ETL tools;
Knowledge of relational and dimensional data modeling (Data Warehouse);
Experience with SQL databases;
Experience with AWS Big Data toolset such as EMR, Kinesis, Redshift, S3, Glue, Elasticsearch;
Knowledge of Kafka;
Knowledge of Data Lake and DataOps.
AWS certifications are a plus;
Experience with cloud infrastructure-as-code tools such as Terraform and CloudFormation.
Benefits
Swile flexible card for meal and grocery expenses (VA and VR)
Totalpass or Gympass
Mental health support – Psicologia Viva
Bradesco Health Insurance
Bradesco Dental Plan
Profit sharing
Childcare assistance for new mothers
Support for certification costs
Special talks and webinars
RAF referral bonus program
Life insurance
Subsidy for English or Italian lessons
Discount for Open English
Birthday gift
Relocation possibility (international)
Partnerships with universities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data ingestion, data integration, data processing, data storage, Big Data, Behavior Driven Development, data extraction, PySpark, ETL, SQL