Work alongside the Cards Data Engineering team to create data pipelines for ingesting and provisioning card-domain data into Santander Brazil’s Corporate Data Lake.
Work within an agile team on a strategic project. Experience with Databricks and PySpark is required.
Requirements
Databricks skills: Experience working with Apache Spark on Databricks, including building and optimizing data pipelines.
Experience in PySpark, Python and Kedro: Strong programming skills in PySpark and Python and experience with Kedro to develop, debug and maintain data transformation code.
Batch and streaming data processing: Knowledge of batch and streaming (messaging) data processing, with the ability to design, implement and maintain processing pipelines.
DevOps knowledge: Familiarity with using Jenkins for continuous integration and continuous delivery (CI/CD), as well as automating deployment tasks and managing pipelines.
Git: Proficiency in Git for source code version control and effective collaboration within development teams.
Agile methodologies: Understanding of agile principles and practices such as Kanban and Scrum for effective collaboration and project management.
Orchestration (e.g., Control-M): Knowledge of workflow orchestration tools for scheduling and controlling workflows.
Microsoft Azure knowledge: Experience with key Microsoft Azure data services, including Azure Databricks, Azure Data Factory and Azure Storage Accounts.
AWS knowledge: Experience with key services such as Aurora PostgreSQL, CloudWatch, Lambda and S3.
Experience with on-premises environments (Cloudera): Previous experience with Cloudera or other on-premises big data solutions, including Hadoop, HBase and Hive, is desirable.
Object-oriented development knowledge: Familiarity with Java is very helpful (writing Java code is not required, but the ability to read and interpret it is expected).
Optional certifications: AZ-900 (Microsoft Azure Fundamentals) and DP-900 (Microsoft Azure Data Fundamentals) are preferred and demonstrate solid knowledge of the Azure platform and data fundamentals.
Benefits
Bradesco Health Plan (30% copayment);
Bradesco Dental Plan (no employee contribution);
Life insurance;
Wellhub (Gympass);
Childcare allowance;
Allowance for dependents with special needs;
Payroll-deductible loan;
Private pension plan;
Pet benefits;
SESC benefits;
Conexa telemedicine;
Expense allowance;
Meal / Food vouchers;
Multi-benefits card;
Medical plan upgrade option.
We are a company committed to social responsibility: extended maternity and paternity leave;
INMaterna program: support program for pregnant employees;
Newborn welcome kit and the book "It Happened When I Was Born";
Professional development: courses available through our internal university;
100% remote or hybrid work, as applicable to the project.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Databricks, PySpark, Python, Kedro, batch data processing, streaming data processing, DevOps, Git, workflow orchestration, object-oriented development