The data engineering tech lead will work with the Cards data engineering team to build pipelines that ingest and deliver card-domain data into Santander Brazil's Corporate Data Lake.
The role sits within an agile team on a strategic project and requires knowledge of Databricks and PySpark.
Responsible for mastering, and driving institutional adoption of, Data Analysis concepts, tools and technologies, and for structuring and keeping the current Data Analysis Guidelines and Policies technically up to date.
Responsible for defining principles to be applied to the control, administration and specification of various data methods and models.
Responsible for defining the technologies, techniques and standards to be followed in the company's Data Analysis area.
Responsible for supporting the design, structuring and optimization of databases, along with other related duties inherent to the role.
Requirements
Databricks skills: Experience working with Apache Spark on Databricks, including building and optimizing data pipelines.
Experience in PySpark, Python and Kedro: Strong programming skills in PySpark, Python and Kedro to develop, debug and maintain data transformation code.
Batch and streaming data processing: Knowledge of batch and streaming (messaging) data processing, with the ability to design, implement and maintain data processing pipelines.
DevOps knowledge: Familiarity with using Jenkins for continuous integration and delivery (CI/CD), as well as automation of deployment tasks and pipeline management.
Git: Proficiency with Git for source code version control and effective collaboration in development teams.
Agile methods: Understanding of agile principles and practices, such as Kanban and Scrum, for effective collaboration and project management.
Orchestration (e.g., Control-M or others): Knowledge of process orchestration tools, important for scheduling and controlling workflows.
Microsoft Azure knowledge: Experience with key Microsoft Azure data services, including Azure Databricks, Azure Data Factory and Azure Storage Accounts.
Preferred: Previous experience with the Cloudera platform or other on-premises big data solutions, including Hadoop, HBase and Hive.
Familiarity with Java is very helpful (not required to write code, but to read/interpret code).
AZ-900 (Microsoft Azure Fundamentals) and DP-900 (Microsoft Azure Data Fundamentals) certifications are preferred.
Benefits
Bradesco Health Plan (30% co-payment)
Bradesco Dental Plan (no employee contribution)
Life Insurance
Wellhub (Gympass)
Daycare Allowance
Allowance for Children with Special Needs
Payroll-deductible Loan
Private Pension Plan
Pet care benefits
SESC benefits (social and recreational services)
Conexa Telemedicine
Expense Allowance
Meal/Food Allowance
Multi-benefit Card
Medical plan upgrade
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Databricks, PySpark, Python, Kedro, batch data processing, streaming data processing, DevOps, Jenkins, Git, orchestration