Tech Stack
Airflow, Apache Spark, BigQuery, Google Cloud Platform, Kubernetes, Python, SQL
About the role
- Work in a cross-functional product team to design and implement data-centric features for Europe’s largest Ad Network
- Help scale data stores, data pipelines, and ETL processes handling terabytes of data for one of the largest retail companies
- Design and implement efficient data processing workflows
- Extend the reporting platform for external customers and internal stakeholders to measure advertising performance
- Continue to develop the custom data processing pipeline and continuously improve the technology stack (a representative Spark sketch follows this list)
- Collaborate with machine learning engineers and software engineers to build and integrate fully automated and scalable reporting, targeting and ML solutions
- Participate in company and engineering-specific onsite events despite the remote setup
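As a rough illustration of the kind of data processing workflow described above, a minimal PySpark job might aggregate raw ad events into daily performance metrics. All paths, table names, and columns below are hypothetical placeholders, not taken from the actual codebase.

```python
# Minimal PySpark sketch of a daily ad-performance aggregation.
# Bucket paths, column names, and partitioning are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-ad-performance").getOrCreate()

# Read one day of raw ad events (Parquet assumed as the storage format)
events = spark.read.parquet("gs://example-bucket/raw/ad_events/date=2024-01-01/")

# Aggregate impressions and clicks per campaign and derive a click-through rate
daily_metrics = (
    events
    .groupBy("campaign_id")
    .agg(
        F.count(F.when(F.col("event_type") == "impression", True)).alias("impressions"),
        F.count(F.when(F.col("event_type") == "click", True)).alias("clicks"),
    )
    .withColumn("ctr", F.col("clicks") / F.col("impressions"))
)

# Write the aggregated metrics back to the lake for the reporting platform
daily_metrics.write.mode("overwrite").parquet(
    "gs://example-bucket/reporting/daily_ad_performance/date=2024-01-01/"
)
```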
Requirements
- 3+ years of professional experience working on data-intensive applications
- Fluency with Python and good knowledge of SQL
- Experience developing scalable data pipelines with Apache Spark
- Good understanding of efficient algorithms and ability to analyze them
- Curiosity about how databases and other data processing tools work internally
- Familiarity with git
- Ability to write testable and maintainable code that scales
- Excellent communication skills and a team-player attitude
Great if you also have:
- Experience with Kubernetes
- Experience with Google Cloud Platform
- Experience with Snowflake, BigQuery, Databricks, and Dataproc
- Knowledge of columnar databases and file formats like Apache Parquet
- Knowledge of Delta Lake and other big data technologies
- Experience with workflow management solutions like Apache Airflow (see the orchestration sketch after this list)
- Affinity for data science tasks to prototype reporting and ML solutions
- Knowledge of Dataflow / Apache Beam
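For orchestration with a workflow manager like Airflow, a job such as the Spark aggregation sketched above is typically scheduled as a DAG task. The following is a minimal sketch assuming Airflow 2.x with the Apache Spark provider installed; the DAG id, task id, and application path are purely illustrative.

```python
# Minimal Airflow 2.x DAG sketch scheduling a daily Spark aggregation job.
# dag_id, task_id, and the submitted application path are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_ad_performance",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # classic Airflow 2.x scheduling argument
    catchup=False,
) as dag:
    aggregate_events = SparkSubmitOperator(
        task_id="aggregate_ad_events",
        application="/opt/jobs/daily_ad_performance.py",  # placeholder path
        conn_id="spark_default",
    )
```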