Salary
💰 $148,000 - $202,000 per year
Tech Stack
Airflow, Amazon Redshift, Apache, AWS, Azure, BigQuery, Cloud, ETL, Python, Spark, SQL
About the role
- The Principal Data Engineer will design, build, implement, and maintain data processing pipelines for the extraction, transformation, and loading (ETL) of data from a variety of data sources.
Expected Duties:
- Design, build, implement, and maintain data processing pipelines for the extraction, transformation, and loading (ETL) of data from a variety of data sources (a minimal pipeline sketch follows this list)
- Lead the writing of complex SQL queries to support analytics needs
- Develop technical tools and programs that leverage artificial intelligence, machine learning, and big-data techniques to cleanse, organize, and transform data, and to maintain, defend, and update data structures and integrity on an automated basis
- Evaluate and recommend tools and technologies for data infrastructure and processing; collaborate with engineers, data scientists, data analysts, product teams, and other stakeholders to translate business requirements into technical specifications and coded data pipelines
- Work with tools, languages, data processing frameworks, and databases such as R, Python, SQL, Databricks, Spark, Delta, and APIs; work with structured and unstructured data from a variety of data stores, such as data lakes, relational database management systems, and data warehouses
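To make the pipeline duties concrete, here is a minimal PySpark sketch of the extract-transform-load pattern described above. The paths, table, and column names (s3://example-lake/..., order_id, amount) are hypothetical placeholders, not details from this posting, and the Delta write assumes the delta-spark package is configured.

```python
# A minimal ETL sketch in PySpark; all names and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw semi-structured events from a data lake path (hypothetical).
raw = spark.read.json("s3://example-lake/raw/orders/")

# Transform: cleanse and reshape with DataFrame operations and SQL.
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("amount") > 0)
)
orders.createOrReplaceTempView("orders")
daily = spark.sql("""
    SELECT CAST(order_ts AS DATE) AS order_date,
           customer_id,
           SUM(amount) AS total_amount
    FROM orders
    GROUP BY CAST(order_ts AS DATE), customer_id
""")

# Load: write a curated Delta table for downstream analytics (hypothetical path;
# assumes delta-spark is installed and configured on the cluster).
daily.write.format("delta").mode("overwrite").save("s3://example-lake/curated/daily_orders/")
```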
Requirements
- The role will include work on problems of diverse scope where analysis of information requires evaluation of identifiable factors. Work is expected to be performed independently, exercising sound judgment.
- Ability to assess unusual circumstances and use sophisticated analytical and problem-solving techniques to identify root causes
- Ability to build relationships and networks with senior internal and external partners who are not familiar with the subject matter, which often requires persuasion
- Architect and scale our modern data platform to support real-time and batch processing for financial forecasting, risk analytics, and customer insights
- Enforce high standards for data governance, quality, lineage, and compliance (see the data-quality sketch after this list)
- Partner with stakeholders across engineering, finance, sales, and compliance to translate business requirements into reliable data models and workflows
- Evaluate emerging technologies and lead proofs of concept (POCs) that shape the future of our data stack
- Champion a culture of security, automation, and continuous delivery in all data workflows
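As one illustration of the governance and quality bullet above, below is a minimal sketch of an automated data-quality gate in PySpark. The table path, column names, and checks are hypothetical and would be tailored to the actual platform.

```python
# A minimal automated data-quality gate; path, columns, and checks are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()
df = spark.read.format("delta").load("s3://example-lake/curated/daily_orders/")

# Each check counts offending rows; any non-zero count fails the run
# before bad data reaches downstream consumers.
checks = {
    "null_customer_id": df.filter(F.col("customer_id").isNull()).count(),
    "negative_amount": df.filter(F.col("total_amount") < 0).count(),
    "duplicate_keys": df.count() - df.dropDuplicates(["order_date", "customer_id"]).count(),
}
failures = {name: n for name, n in checks.items() if n > 0}
if failures:
    raise ValueError(f"Data-quality checks failed: {failures}")
```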
Technical Qualifications:
- Deep expertise in Python, SQL, and distributed processing frameworks such as Apache Spark (including Databricks), plus cloud data warehouses such as Snowflake, Redshift, and BigQuery.
- Proven experience with cloud-based data platforms (preferably AWS or Azure).
- Hands-on experience with data orchestration tools (e.g., Airflow, dbt) and data warehouses (e.g., Databricks, Snowflake, Redshift, BigQuery); a minimal orchestration sketch follows this list.
- Strong understanding of data security, privacy, and compliance within a financial services context.
- Experience working with structured and semi-structured data (e.g., Delta, JSON, Parquet, Avro) at scale.
- Familiarity with modeling datasets in Salesforce, NetSuite, and Anaplan to solve business use cases is required.
- Previous experience democratizing data at scale for the enterprise is a huge plus.
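To illustrate the orchestration tooling named above, here is a minimal Airflow DAG sketch that schedules the pipeline and quality gate from the earlier sketches. The DAG id, schedule, and callables are hypothetical, and the `schedule` argument assumes Airflow 2.4+.

```python
# A minimal Airflow DAG sketch; dag_id, schedule, and callables are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_and_transform():
    ...  # e.g., submit the Spark ETL job sketched earlier

def run_quality_checks():
    ...  # e.g., run the data-quality gate sketched earlier

with DAG(
    dag_id="daily_orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="extract_and_transform", python_callable=extract_and_transform)
    dq = PythonOperator(task_id="run_quality_checks", python_callable=run_quality_checks)
    etl >> dq  # quality checks run only after the load succeeds
```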