Explore, analyze, and onboard data sets from data producers to ensure they are ready for processing and consumption.
Develop and maintain scalable, efficient data pipelines for data collection, processing (quality checks, de-duplication, etc.), and integration into data lake and data warehouse systems.
Optimize and monitor data pipeline performance to ensure minimal downtime.
Implement data quality control mechanisms to maintain data set integrity.
Collaborate with stakeholders to keep data flowing smoothly and to address issues and improvement needs.
Manage the deployment and automation of pipelines and infrastructure using Terraform, Flyte, and Kubernetes.
Support strategic data analysis and operational tasks as needed.
Lead end-to-end data pipeline development, from initial data discovery and ingestion to transformation, modeling, and delivery into production-grade data platforms.
Integrate and manage data from 3+ distinct sources, designing efficient, reusable frameworks for multi-source data processing and harmonization.
Requirements
Demonstrated ability to navigate ambiguous data challenges, ask the right questions, and design effective, scalable solutions.
Proficiency in designing, building, and maintaining large-scale, reliable data pipeline systems.
Advanced SQL skills for querying and processing data.
Proficiency in Python, with experience in Spark for data processing.
3+ years of experience in data engineering, including data modeling and ETL pipelines.
Familiarity with cloud-based tools and infrastructure management using Terraform and Kubernetes is a plus.
Benefits
💻 Remote work
✈️ Travel expected 2-3 times per year for company-sponsored events
🩺 Medical, dental, vision, life, and disability insurance
📈 401(k) retirement plan; flexible spending and health savings accounts
🏝️ 15 days of paid time off + additional front-loaded personal days
🏖️ 14 company-recognized holidays + paid volunteer days
👶 Up to 8 weeks of paid parental leave + 10 weeks of paid bonding leave
🌈 LGBTQ+ Health Services
🐶 Pet insurance
📣 Check out more of our benefits here: https://www.mcg.com/about/careers/benefits/