Tech Stack
Airflow, AWS, Cloud, Docker, ETL, Google Cloud Platform, Java, Kafka, Kubernetes, Python, Spark, SQL, Tableau
About the role
- Drive, develop and maintain ETL processes to integrate data from various sources into a centralized data warehouse
- Optimize complex data pipelines
- Implement data quality checks and validations to ensure data accuracy and completeness
- Proactively monitor data infrastructure, troubleshoot performance issues, and collaborate with cross-functional teams to resolve them promptly
- Collaborate with data analysts to understand their data needs and provide guidance on data structure and quality
- Work with business stakeholders to understand their data requirements and translate those into technical specifications
- Identify and resolve data quality issues, and work with the appropriate teams to implement solutions
- Work closely with the data engineering team to optimize data pipeline performance and scalability
- Provide mentoring and code review to peers
Requirements
- Bachelor's degree in Computer Science, Information Technology, Data Science or a related field
- 5+ years of technical engineering experience building data processing applications (batch and streaming), coding in languages including, but not limited to, Python, Java, Spark, and SQL
- Experience with ETL processes, data modeling, and data warehousing concepts
- Comfortable manipulating large datasets
- Strong analytical and problem-solving skills
- Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams
- Experience with cloud-based data storage and computing platforms (AWS, Snowflake or GCP)
- Experience containerizing deployments with Docker
- Familiarity with data privacy and security best practices
- Experience with BI and data visualization tools such as Tableau or Looker
- Proven track record of scaling and maintaining data components in a production environment
- Growth Mindset: Adaptable and open to feedback, thriving in a fast-paced environment with frequent iterations and changes.
- Experience with streaming and other data-related technologies, such as Kafka, Kinesis, Flink, and Spark, and orchestration tools like Airflow or Dagster
Nice to haves
- Experience with data governance frameworks and policies
- Experience with data quality tools like Great Expectations and Soda, and data workflow tools like dbt
- Familiarity with container orchestration systems, such as Kubernetes, and GitOps tools like ArgoCD

If you have a passion for data engineering and are excited to work on challenging problems in the travel industry, we want to hear from you! Join our team of experienced data scientists and engineers to deliver high-quality, tested products quickly and regularly.