Tech Stack
Airflow, Amazon Redshift, Apache, AWS, Azure, BigQuery, Docker, ETL, Flask, Java, Jenkins, Kafka, Kubernetes, MySQL, NoSQL, NumPy, Pandas, Postgres, Python, Scala, Spark, SQL, Terraform
About the role
- Design and lead the implementation of scalable, high-performance data architectures (ETL/ELT)
- Define standards and best practices across the organization
- Develop applications, APIs, and pipelines for data ingestion, processing, and consumption
- Implement advanced solutions on AWS and Azure (data warehouses, data lakes, streaming)
- Automate and optimize ingestion, transformation, and loading processes with resilience and observability (a minimal orchestration sketch follows this list)
- Collaborate with Data Scientists, Analysts, and Product teams to integrate analytics and ML models into production
- Mentor and guide junior and mid-level engineers through code reviews and architecture decisions
- Explore new technologies (Data Mesh, RAG, MLOps, etc.) to drive innovation
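As a flavor of the pipeline work described above, here is a minimal sketch of a daily ETL job using Airflow's TaskFlow API. The DAG name, the stubbed records, and the load target are illustrative assumptions, not part of this role's actual codebase.

```python
# Minimal daily ETL sketch with Airflow's TaskFlow API.
# daily_sales_etl and its sample data are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_sales_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from the source system (stubbed here).
        return [{"order_id": 1, "amount": 42.0}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Apply business rules; real pipelines would use Pandas or Spark.
        return [r for r in rows if r["amount"] > 0]

    @task
    def load(rows: list[dict]) -> None:
        # Write to the warehouse (e.g., Redshift or BigQuery) via a hook.
        print(f"Loaded {len(rows)} rows")

    # TaskFlow infers the extract -> transform -> load dependency chain.
    load(transform(extract()))


daily_sales_etl()
```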
Requirements
- 7+ years of experience in Data Engineering or Software Development with a data focus
- Strong programming skills in Python (Pandas, NumPy, SQLAlchemy) and advanced SQL
- Expertise with Apache Spark, Airflow, and modern orchestration tooling
- Experience with databases (PostgreSQL, MySQL, SQL Server) and data warehouses (Snowflake, BigQuery, Redshift)
- Hands-on experience with AWS or Azure data ecosystems
- Knowledge of DevOps tools: Docker, CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions)
- Bachelor's/Master's in Engineering, Computer Science, or related fields
- Fluent in technical English (reading/writing)
- Nice to have: Scala/Java, Kafka, NoSQL, Graph Databases, Kubernetes, Terraform, time-series DBs, pytest/unittest, API development (FastAPI/Flask); a tested-transformation sketch follows this list
- Background in Machine Learning and MLOps
- Previous experience in startups, scale-ups, or high-growth environments
- Technical leadership and autonomy; ability to mentor junior and mid-level engineers
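To illustrate the Python/Pandas and pytest items above, here is a minimal sketch of a transformation with a unit test. The revenue_by_region function, its column names, and the sample data are hypothetical.

```python
# Hypothetical Pandas transformation plus a pytest-style test.
import pandas as pd


def revenue_by_region(orders: pd.DataFrame) -> pd.DataFrame:
    """Aggregate order amounts per region, excluding refunds (amount <= 0)."""
    valid = orders[orders["amount"] > 0]
    return valid.groupby("region", as_index=False)["amount"].sum()


def test_revenue_by_region() -> None:
    orders = pd.DataFrame(
        {
            "region": ["EU", "EU", "US"],
            "amount": [10.0, -5.0, 20.0],  # -5.0 is a refund and is dropped
        }
    )
    result = revenue_by_region(orders)
    assert result.set_index("region")["amount"].to_dict() == {"EU": 10.0, "US": 20.0}
```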