Tech Stack
Azure, Cassandra, Cloud, HBase, Kafka, Maven, MongoDB, Oracle, Python, Scala, Spark, SQL, Vault
About the role
- Lead data delivery on the Data and Analytics platform; design and implement cloud-native data engineering solutions
- Design and build batch and real-time data pipelines with automated, repeatable delivery aligned to enterprise data governance standards
- Develop and propose design patterns that address data access and query patterns, data consumption needs, and internal architecture standards
- Collaborate with stakeholders across the business, data scientists and IT to refine data requirements and support analytics initiatives
- Accelerate data onboarding to the Data and Analytics platform
- Build robust data pipelines that enable broader data consumption
- Improve data pipeline development quality through DevSecOps practices
Requirements
- Programming experience in Spark using modern languages such as Python and Scala
- Experience working with modern data architectures such as Azure Data Lake Storage, Azure Databricks, Azure Synapse Analytics (formerly SQL Data Warehouse) and Delta Lake
- Experience championing data engineering principles within an organization or team
- Experience working with integration patterns and technologies such as Azure Event Hubs, Azure Functions and C#
- Knowledge and expertise in database modeling techniques: Data Vault, star schema, snowflake schema, 3NF
- Experience working with streaming architectures and real-time technologies: Spark Streaming, Kafka, Flink, Storm
- Experience working with relational and non-relational database technologies: SQL Server, Oracle, Cassandra, MongoDB, Cosmos DB, HBase
- Experience working with source code and configuration management environments such as Azure DevOps, Git, Maven, Nexus
- Experience delivering 2 to 3 Data Vault projects
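To illustrate the Data Vault modeling experience asked for above, here is a minimal sketch of the hash-key conventions typical of Data Vault 2.0, in plain Python. The table and column names (`hub_customer_hk`, `customer_id`, etc.) are hypothetical examples, not part of this role's actual schema:

```python
import hashlib
from datetime import datetime, timezone

def hash_key(*business_keys: str) -> str:
    """MD5 of the concatenated, normalized business key(s) --
    a common surrogate-key convention in Data Vault 2.0."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def hub_row(business_key: str, record_source: str) -> dict:
    """Build a row for a hypothetical customer hub: surrogate hash key,
    the business key itself, load timestamp, and record source."""
    return {
        "hub_customer_hk": hash_key(business_key),
        "customer_id": business_key,
        "load_dts": datetime.now(timezone.utc),
        "record_source": record_source,
    }

def hashdiff(attributes: dict) -> str:
    """Hash over all descriptive attributes, used by satellite tables
    to detect changes between loads without comparing every column."""
    payload = "||".join(f"{k}={v}" for k, v in sorted(attributes.items()))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

# The same business key always yields the same hub hash key, which is
# what makes Data Vault loads repeatable (idempotent) across batches.
row = hub_row("C-1001", "CRM")
assert row["hub_customer_hk"] == hash_key(" c-1001 ")  # normalization
```

Deterministic hash keys are what make the "automated, repeatable delivery" in the role description practical: hubs, links, and satellites can be reloaded in any order without generating duplicate surrogate keys.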