Tech Stack
Apache, AWS, PySpark, Python, Spark SQL
About the role
- Ambush is a People Company, a remote consulting firm focused on integrating remote talent into US product and development teams.
- Design, build, and maintain high-volume data ingestion pipelines to process merchant transactions.
- Implement robust measures to guarantee data quality, reliability, and optimal pipeline performance (a minimal sketch of this kind of check follows this list).
- Share knowledge with the engineering team and foster a culture of agile, data-driven decision-making.
- Manage tasks and time independently without micromanagement.
- Communicate daily in English across cross-functional teams.
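To make the day-to-day concrete, here is a minimal, illustrative PySpark sketch of the kind of batch ingestion and data-quality step described above. It is not taken from the posting; the S3 paths and column names (transaction_id, amount, merchant_id) are hypothetical placeholders.

```python
# Illustrative only: a minimal PySpark batch-ingestion step with basic
# data-quality checks. Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("merchant-txn-ingest").getOrCreate()

# Read one day's raw transaction files (placeholder S3 path).
raw = spark.read.json("s3://example-bucket/raw/transactions/2024-01-01/")

# Basic quality gates: drop records missing a primary key or with
# non-positive amounts, then de-duplicate on the transaction id.
clean = (
    raw.filter(F.col("transaction_id").isNotNull())
       .filter(F.col("amount") > 0)
       .dropDuplicates(["transaction_id"])
)

# Persist the curated output in a columnar format, partitioned by merchant.
clean.write.mode("overwrite").partitionBy("merchant_id").parquet(
    "s3://example-bucket/curated/transactions/2024-01-01/"
)
```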
Requirements
- Professional experience as a Data Engineer specializing in designing and implementing solutions for high-volume, real-time, or batch data ingestion.
- Experience with Python for building robust data pipelines.
- Extensive experience using Apache Spark (Spark SQL, PySpark) to optimize and accelerate complex data processing jobs.
- Strong understanding of the AWS ecosystem, with a focus on core data services such as S3 and Lambda.
- Familiarity with AWS Glue as the primary engine for data processing and complex transformations (see the sketch after this list for how Glue, Lambda, and S3 typically fit together).
- Ability to act as a coach and leader, focusing on teaching and advocating for good engineering and collaboration practices.
- Experience with Snowflake is a plus.
- Excellent English communication skills.
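As an illustrative sketch of how the S3, Lambda, and Glue pieces named above often fit together (not part of the posting itself; the job name, event wiring, and arguments are assumptions), a Lambda function might start a Glue job whenever a new transaction file lands in S3:

```python
# Illustrative only: a minimal AWS Lambda handler that starts a Glue job
# when a new object arrives in S3. The event shape follows the standard
# S3 put-event trigger; the job name and argument key are placeholders.
import boto3

glue = boto3.client("glue")

def handler(event, context):
    # An S3 put-event delivers one record per new object.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Kick off the (hypothetical) Glue ETL job, passing the new object
        # as a job argument so the script knows which file to process.
        glue.start_job_run(
            JobName="merchant-transactions-etl",  # placeholder job name
            Arguments={"--input_path": f"s3://{bucket}/{key}"},
        )

    return {"status": "ok"}
```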