Tech Stack
Amazon Redshift, AWS, Cloud, ETL, Hadoop, Java, Kafka, Oracle, Python, Scala, Spark, SQL, Terraform
About the role
- Design and define data architecture blueprints, data flow diagrams, system integrations, and storage solutions
- Lead development of a cloud-based data platform on AWS using services such as AWS Glue, Lambda, Redshift, and S3
- Implement data pipelines and warehouses to support AI, automation, analytics, and reporting
- Oversee ingestion and optimization of large-scale datasets and migrate legacy Hadoop-based data into AWS
- Develop and maintain conceptual, logical, and physical data models for databases and data lakes
- Enable data science and analytics teams by creating pipelines for ML models and handling streaming data
- Optimize ETL/Spark jobs, implement incremental loading strategies, and ensure scalability and reliability
- Embed data governance by implementing data cataloging, lineage tracking, access controls, encryption, and retention policies
- Collaborate with technology leadership, product managers, business analysts, and engineering leads
- Provide technical leadership to data engineers, conduct design/code reviews, and lead architecture review sessions
- Evaluate new technologies, pilot proof-of-concepts, and provide thought leadership on data strategy
Requirements
- 10+ years (15+ preferred) of experience in data architecture, data engineering, or related fields
- Demonstrated experience leading data-centric projects from concept to production
- Hands-on expertise with AWS data services – especially AWS Glue, Lambda, Redshift, and S3
- Strong experience with big data technologies including Hadoop and Spark
- Exceptional skills in data modeling and database design
- Deep understanding of SQL and proficiency in writing and tuning complex queries
- Proficiency in programming for data engineering – Python (or Scala/Java) for ETL/ELT scripting
- Experience with infrastructure-as-code (Terraform/CloudFormation) and CI/CD pipelines is a plus
- Knowledge of machine learning concepts and experience supporting data science teams
- Experience with real-time data streaming and processing (Kinesis, Kafka, or similar) is a plus
- Excellent communication skills and experience collaborating in cross-functional teams
- Bachelor’s degree in Computer Science, Information Systems, or a related field required
- (Preferred) AWS Certified Solutions Architect or AWS Certified Data Analytics certification
- Applicants must have work authorization that does not require visa sponsorship now or in the future