Salary
💰 $138,900 - $195,000 per year
Tech Stack
Airflow, Android, AWS, Cloud, ETL, Python, Spark, SQL
About the role
- Validate ETL logic, business logic, and data quality in Snowflake, Databricks, and other data platforms before code changes are released to production.
- Partner with data engineers to identify potential failure points and catch issues early.
- Ensure the quality of every release using rigorous, data-driven testing practices.
- Develop automated, reusable tests to improve coverage, increase development velocity, and reduce regression risk (see the illustrative sketch after this list).
- Translate business and technical requirements into test scenarios to validate KPIs, metrics, and business rules.
- Contribute to and enhance the existing test automation framework, with a focus on scalability and maintainability.
- Collaborate closely with Data Analysts, Product Managers, and Engineering teams to ensure accuracy, completeness, and usability of the data.
- Work within product teams to build architectures that are robust, fault-tolerant, and cloud-native.
- Build solutions for problems of sizeable scope and complexity that are successfully deployed to customers and users.
- Influence and drive software engineering best practices within the team.
- Technically lead and deliver multiple projects using an Agile methodology while reviewing team members' code.
- Participate in developing technical and business approaches, and new or enhanced technical tools.
- Own the design of software programs or systems within the team and across the organization.
- Write code that establishes and enhances frameworks.
- Review code for design, testability, and usability.
- Build solutions that scale and perform.
- Identify opportunities to improve the system, product, or service with each iteration.
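The automated-testing responsibilities above center on reusable data-quality checks. As a minimal sketch of what such a check could look like, the following Python snippet defines a generic null-rate assertion against any DB-API 2.0 connection. The `orders` table, `customer_id` column, and threshold are hypothetical; sqlite3 is used only so the example runs standalone, and a Snowflake or Databricks connection object could be swapped in, since their Python connectors follow the same DB-API interface:

```python
# Minimal sketch of a reusable data-quality check (hypothetical table/column names).
# Any DB-API 2.0 connection works; sqlite3 keeps the example self-contained.
import sqlite3


def null_rate(conn, table: str, column: str) -> float:
    """Return the fraction of rows where `column` is NULL."""
    cur = conn.execute(
        f"SELECT COUNT(*) - COUNT({column}), COUNT(*) FROM {table}"
    )
    nulls, total = cur.fetchone()
    return nulls / total if total else 0.0


def assert_null_rate_below(conn, table: str, column: str, threshold: float) -> None:
    """Fail loudly when the null rate exceeds the agreed threshold."""
    rate = null_rate(conn, table, column)
    assert rate <= threshold, (
        f"{table}.{column}: null rate {rate:.2%} exceeds threshold {threshold:.2%}"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10), (2, None), (3, 11), (4, 12)],  # one NULL out of four rows
    )
    assert_null_rate_below(conn, "orders", "customer_id", threshold=0.30)
```

Parameterizing checks like this (e.g., via pytest) across tables and pipelines is one common way to make them reusable rather than one-off.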
Requirements
- 5+ years of relevant experience
- Strong experience validating data pipelines, ETL processes, and data warehouses in production environments.
- Expert-level SQL skills and hands-on experience with large datasets (terabytes or more), capable of identifying data anomalies through efficient queries.
- Proficiency with Snowflake, Hive, Databricks, and other modern data platforms.
- Solid Python skills and experience with test automation for data pipelines.
- Familiarity with tools like Airflow and Spark and understanding of CI/CD principles.
- Strong collaboration and communication skills; able to work effectively across cross-functional teams.
- B.S. in Computer Science (or equivalent degree or work experience)
- Nice to have: Experience with BDD frameworks (e.g., Behave; a minimal sketch follows this list)
- Nice to have: Experience working in AWS or other cloud environments
- Nice to have: Familiarity with open-source data quality tools like Deequ, Great Expectations, or similar custom frameworks
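For the BDD nice-to-have, Behave pairs a plain-language scenario with Python step definitions. The feature text and step bindings below are a hypothetical illustration of turning a business rule ("no order ships before it is paid") into an executable test, not an excerpt from any real suite; the in-memory orders stub stands in for a warehouse query:

```python
# steps/order_rules.py -- hypothetical Behave step definitions.
# Matching feature file (features/order_rules.feature):
#   Feature: Order shipping rules
#     Scenario: Orders must be paid before they ship
#       Given the orders table from the latest ETL run
#       When I look for orders shipped before payment
#       Then no such orders should exist
from behave import given, when, then


@given("the orders table from the latest ETL run")
def step_load_orders(context):
    # A real suite would query Snowflake/Databricks here; stubbed for illustration.
    context.orders = [
        {"order_id": 1, "paid_at": "2024-01-01", "shipped_at": "2024-01-02"},
        {"order_id": 2, "paid_at": "2024-01-03", "shipped_at": "2024-01-03"},
    ]


@when("I look for orders shipped before payment")
def step_find_violations(context):
    # ISO-8601 date strings compare correctly as plain strings.
    context.violations = [
        o for o in context.orders if o["shipped_at"] < o["paid_at"]
    ]


@then("no such orders should exist")
def step_assert_no_violations(context):
    assert not context.violations, f"Business rule violated: {context.violations}"
```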