Tech Stack
Amazon Redshift, Apache, AWS, Cloud, ETL, Kafka, Python, Spark, SQL
About the role
- Design and guide implementation of robust, scalable, and secure data pipelines powering AI-driven insights for digital investigations
- Own the data architecture vision, including the data lake, data warehouse, and ETL/ELT pipelines
- Provide architectural leadership for cost efficiency and scalability; identify opportunities to optimize cloud spend
- Collaborate with engineering, product, data science, and security teams to translate product requirements into data architecture
- Conduct deep architecture reviews, assess technical risks, and recommend solutions for data quality, security, and governance
- Define and standardize data modeling practices, governance frameworks, and data pipeline patterns across product teams
- Evaluate new data technologies, streaming platforms, and modern data warehouses to continuously improve the data platform
- Document architectural decisions, data flow diagrams, and cost models; mentor engineers and act as a technical authority
Requirements
- Strong background in building and managing SaaS data platforms on AWS (S3, Glue, Redshift, Kinesis, Lambda)
- Deep understanding of data warehousing, data lakes, streaming architectures, and various data modeling techniques
- Proven experience architecting data pipelines for AI/GenAI use cases; knowledge of MLOps, feature stores, and vector databases
- Proven experience in cloud cost optimization for data platforms (storage, compute, data transfer)
- Proficiency in big data technologies (e.g., Apache Spark, Kafka); a brief illustrative sketch follows this list
- Strong coding background in Python and SQL for data manipulation and pipeline development
- Solid understanding of CI/CD principles for data pipelines and a product-centric DevOps culture
- Excellent communication and interpersonal skills; capable of influencing and aligning stakeholders at all levels
- Fluent in English (Hebrew – an advantage)
- Domain expertise in digital investigations, cyber, or public safety – an advantage
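To make the stack concrete, here is a minimal sketch of the kind of pipeline this role would architect: a PySpark Structured Streaming job that reads events from Kafka and lands partitioned Parquet on S3 as a staging layer for Redshift. All names (broker address, topic, bucket, and the event schema) are hypothetical placeholders, not details from this posting.

```python
# Illustrative only: stream events from Kafka, parse JSON, and write
# partitioned Parquet to S3 for a downstream Redshift COPY.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

# Hypothetical event schema, for illustration.
event_schema = StructType([
    StructField("case_id", StringType()),
    StructField("event_type", StringType()),
    StructField("occurred_at", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder address
    .option("subscribe", "events")                     # placeholder topic
    .load()
)

# Kafka delivers raw bytes; parse the JSON payload against the schema.
events = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
    .withColumn("ingest_date", F.to_date("occurred_at"))
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/events/")             # placeholder bucket
    .option("checkpointLocation", "s3://example-bucket/chk/")  # needed for fault tolerance
    .partitionBy("ingest_date")
    .start()
)
query.awaitTermination()
```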