Tech Stack
Amazon Redshift, AWS, BigQuery, Cloud, ETL, Python, SQL, Tableau
About the role
- Drive quality at every stage of the data pipeline development lifecycle from requirements to release
- Author and conduct tests for data pipeline requirements and data visualization outputs
- Document data flow, transformations, and quality control processes throughout the data engineering lifecycle
- Generate comprehensive reports on data quality metrics, test results, and anomalies detected
- Write and maintain automated test scripts for data validation, transformation testing, and output verification
- Proactively identify opportunities to strengthen data quality frameworks and capabilities
- Collaborate with data engineering, data science, and business stakeholders to develop comprehensive requirements
- Analyze end user requirements to create and maintain comprehensive test plans for data products
- Execute manual and automated test cases, debug data quality issues, and follow up to ensure resolution
- Identify, reproduce, and report data defects and verify fixes across the data ecosystem
- Participate in the agile development process and contribute to continuous improvement of data quality practices
- Represent the data consumers to the data engineers to ensure that requirements are met
- Identify and propose improvements to both the data quality control process and the data engineering lifecycle
Requirements
- 5+ years testing, building, or supporting data pipelines, data warehouses, or business intelligence solutions
- Experience testing data APIs, ETL processes, and data visualization outputs
- Experience with ticketing systems such as Jira or Aha!
- Experience with test case management software for data quality initiatives
- Strong analytical skills with the ability to identify patterns in data quality issues
- Experience with SQL, database technologies, and testing concepts
- Knowledge of data validation techniques for both structured and unstructured data
- Excellent written and verbal communication skills
- Demonstrated leadership ability to build, lead, and inspire data quality initiatives
- Familiarity with automated testing for data pipelines and data visualization outputs
- Experience with API testing frameworks (Postman, Rest Assured, etc.) for data services
- Experience testing or supporting data warehouses, data lakes, or business intelligence platforms
- Experience writing code using Python, SQL, or other languages common in data engineering
- Desire and ability to take on special engineering initiatives that have an impact on data quality
- Preferred: Experience with data observability tools and practices
- Preferred: Experience with AWS
- Preferred: Knowledge of data governance principles and data quality standards
- Preferred: Experience with testing frameworks such as dbt or Great Expectations
- Preferred: Experience working with cloud data platforms (Snowflake, BigQuery, Redshift, etc.)
- Preferred: Experience with data visualization testing (Sigma, Tableau, Power BI, Looker)
- Preferred: Understanding of AI/ML workflows and the importance of data quality in these processes