Tech Stack
Airflow, AWS, Azure, Cloud, ETL, Google Cloud Platform, NoSQL, Python, Spark, SQL
About the role
- Design, implement, test, deploy, and maintain production-grade data products—mainly focused on data pipelines, transformation layers, and real-time data systems.
- Own the full lifecycle of data product development—from concept and design to deployment and maintenance.
- Develop and maintain production ETL/ELT pipelines using DBT (on Spark) and orchestrated workflows in Databricks.
- Apply best practices in Python and SQL to create scalable, maintainable, and efficient data transformations.
- Build monitoring, alerting, and testing pipelines to ensure reliability and performance in production.
- Utilize LLMs and Generative AI technologies to enhance data workflows and feature engineering.
- Collaborate with leading industry data providers to assess and integrate third-party data assets, enhancing the quality and performance of Explorium’s data products.
Requirements
- 4+ years of experience in production-level data engineering, data product development, or related roles.
- Deep proficiency in SQL and Python, and experience working with large-scale data processing systems.
- Proven track record of owning and scaling production-grade data pipelines, including versioning, testing, and monitoring.
- Strong understanding of data modeling, normalization/denormalization trade-offs, and data quality management.
- Experience working with modern data stack tools: DBT, Databricks, Spark, Airflow, Delta Lake, etc.
- Strong analytical and experimentation skills, including the ability to design and evaluate data-driven hypotheses and KPIs.
- Hands-on experience with DBT (nice-to-have).
- Hands-on experience with Databricks or similar data lakehouse platforms (nice-to-have).
- Familiarity with data modeling techniques and working with both SQL and NoSQL databases (nice-to-have).
- Familiarity with GenAI and LLM applications—particularly in extracting structure from unstructured data at scale (nice-to-have).
- Experience working with a wide variety of external data sources and vendors (nice-to-have).
- Familiarity with cloud-native data platforms (e.g., AWS, Azure, or GCP) (nice-to-have).
- BSc/BA in Computer Science, Engineering, or a related technical field—or graduation from a top-tier IDF tech unit.