Tech Stack
Amazon Redshift, AWS, Cloud, DynamoDB, ETL, Jenkins, MySQL, Node.js, Postgres, PySpark, Python, SQL, Thrift
About the role
- Design, build, and maintain batch and real-time data pipelines integrating first-, second-, and third-party data sources
- Lead the design and evolution of BI Delta Lake infrastructure to support analytics and reporting
- Develop data catalogs, validation routines, error logging, and monitoring solutions for high-quality datasets
- Build integrations with marketing, media, and subscription platforms to optimize KPIs
- Partner with Data Architect to enable attribution, segmentation, and activation capabilities across business teams
- Collaborate with product, lifecycle, and marketing teams to democratize insights and improve engagement through data-driven solutions
- Coach engineers and BI team members on best practices for building large-scale and governed data platforms
Requirements
- 4+ years of experience in big data and/or data-intensive projects
- 4+ years of hands-on Python development
- Expert-level SQL development (Redshift, PostgreSQL, MySQL, etc.)
- Strong experience with PySpark
- Experience with AWS services: Redshift, S3, DynamoDB, SageMaker, Athena, Lambda
- Experience with Databricks, Snowflake, Jenkins
- Strong knowledge of data engineering practices: data pipelines, ETL, data governance, metadata management, data lineage
- Experience with APIs, data wrangling, and advanced data transformations
- Bachelor's degree in a STEM field (required)
- Work Authorization: GC, USC, and all valid EADs (OPT, CPT, and H1B not eligible)
- No C2C, 1099, or sub-contract arrangements