Tech Stack
Airflow, Go, Java, Open Source, Python, Rust, SQL
About the role
- Evolve existing source data pipelines to an ELT model of data ingestion (a sketch of the load-then-transform pattern follows this list).
- Cleanly separate source-aligned data products from aggregate data products.
- Richly decorate your data products with metadata to support knowledge transfer, adoption, and the application of Machine Learning.
- Tag and classify your data assets to ensure they are used responsibly throughout the organization, applying masking or restricting access where appropriate.
- Apply software engineering best practices to your code release process to support CI/CD and a high-velocity collaboration model based on InnerSource.
- Register and maintain the catalog entries for your data products to support discoverability and reuse.
- Ensure your data products easily join with other business data products using common identifiers and keys.
- Develop automated and resilient processes that support the end-to-end delivery of business value.
- Publish and deliver on a data product SLO for your data ‘customers’.
- Responsibly share your data products with other internal consumers while balancing the core needs of security and compliance.
- Contribute feedback and recommendations to the Data Platform team in order to remove friction and increase scale for all users.
- Write custom adapters to integrate internal data sources into the centralized Warehouse environment (an adapter sketch follows this list).
- Design and build a Model Context Protocol (MCP) service: a low-latency service that standardizes and serves real-time data to power AI models (a serving sketch follows this list).
- Engineer resilient pipelines from feature stores and databases to accelerate model deployment and enhance prediction accuracy.
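
As a rough illustration of the ELT model mentioned in the first bullet, the sketch below lands source rows untransformed and then derives a data product with SQL inside the warehouse. It uses sqlite3 as a stand-in for a real warehouse connection; the table and file names (`raw_orders`, `orders_daily`, `orders.csv`) are hypothetical.

```python
# Minimal ELT sketch: land raw source rows untransformed, then derive a
# data product with SQL inside the warehouse. sqlite3 stands in for a
# real warehouse connection; all table and file names are hypothetical.
import csv
import sqlite3


def load_raw(conn: sqlite3.Connection, csv_path: str) -> None:
    """Extract-and-load step: copy source rows as-is into a raw table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders (id TEXT, amount TEXT, placed_at TEXT)"
    )
    with open(csv_path, newline="") as f:
        rows = [(r["id"], r["amount"], r["placed_at"]) for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn: sqlite3.Connection) -> None:
    """Transform step: build a typed, aggregated product in-warehouse."""
    conn.executescript("""
        DROP TABLE IF EXISTS orders_daily;
        CREATE TABLE orders_daily AS
        SELECT date(placed_at) AS order_date,
               SUM(CAST(amount AS REAL)) AS total_amount
        FROM raw_orders
        GROUP BY date(placed_at);
    """)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_raw(conn, "orders.csv")  # hypothetical source export
    transform(conn)
```

The point of the pattern is that typing, cleaning, and aggregation all happen after the load, so the raw landing zone stays a faithful copy of the source.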
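For the custom-adapter bullet, one plausible shape is a small extract contract that every internal source implements so ingestion code can treat all systems uniformly. The `SourceAdapter` protocol, `Record` shape, and the fictional paginated API client below are assumptions for illustration, not an existing framework.

```python
# Hypothetical adapter contract for integrating internal sources into the
# central warehouse; every name here is illustrative, not an existing API.
from dataclasses import dataclass
from typing import Iterator, Protocol


@dataclass
class Record:
    """Normalized row shape the ingestion framework expects."""
    key: str
    payload: dict


class SourceAdapter(Protocol):
    """Contract each custom adapter implements so ingestion code can
    pull from any internal system uniformly."""

    def extract(self) -> Iterator[Record]: ...


class InternalApiAdapter:
    """Example adapter wrapping a fictional paginated internal API client."""

    def __init__(self, client) -> None:
        self._client = client  # assumed to expose an iterable fetch_pages()

    def extract(self) -> Iterator[Record]:
        # Flatten pages into normalized records the loader can write
        # straight into a raw warehouse table.
        for page in self._client.fetch_pages():
            for item in page:
                yield Record(key=item["id"], payload=item)
```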
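And for the low-latency serving bullet, a minimal sketch of the idea: keep the freshest standardized features in a fast store and refuse to serve stale entries. The in-memory dict stands in for a real cache, and the TTL, function names, and fallback behaviour are all assumptions.

```python
# Minimal sketch of a low-latency feature-serving layer in the spirit of
# the MCP service described above; the cache layout, TTL, and names are
# assumptions rather than a real implementation.
import time

_FEATURES: dict[str, dict] = {}   # in-memory stand-in for a fast store
_LOADED_AT: dict[str, float] = {}
_TTL_SECONDS = 60.0               # hypothetical freshness budget


def put_features(entity_id: str, features: dict) -> None:
    """Standardize and cache the latest feature vector for an entity."""
    _FEATURES[entity_id] = features
    _LOADED_AT[entity_id] = time.monotonic()


def get_features(entity_id: str) -> dict | None:
    """Serve features to a model, refusing entries older than the TTL;
    a None return signals the caller to fall back to the feature store."""
    loaded = _LOADED_AT.get(entity_id)
    if loaded is None or time.monotonic() - loaded > _TTL_SECONDS:
        return None
    return _FEATURES[entity_id]
```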
Requirements
- Bachelor's degree in Computer Science, Computer Engineering, or related field.
- 4+ years of software development experience with a focus on data applications & systems.
- Exceptional software and data engineering skills that lead to elegant and maintainable data products.
- Expert-level proficiency in using SQL for data transformation.
- Proficiency in at least one general-purpose programming language, e.g., Python, Go, Java, or Rust.
- Strong opinions and perspectives that you kindly debate, defend, or change to ensure that the entire team moves as one.
- The ability to set and reset the bar on all things quality, from code through to data, and everything in between.
- Deep empathy for the users of your data products, leading to a constant focus on removing friction, increasing adoption, and delivering business results.
- The discipline to prune and prioritize work to maximize your contributions and impact.
- Bias for action and leading by example.
- Past experience in building enterprise data products that have a high level of governance and compliance requirements.
- Strong communication skills and experience interacting with cross-functional business and engineering teams.
- The ability to undertake business needs analysis in direct consultation with stakeholders.
- The following are considered a plus: familiarity with open source or inner source development and processes; familiarity with data mesh architectural principles; experience with Snowflake, Fivetran, dbt, and Airflow / Astronomer.