Salary
💰 $144,000 - $234,000 per year
Tech Stack
Apache, AWS, Distributed Systems, Java, Kafka, NoSQL, Open Source, Scala, Spark, SQL
About the role
- Build data platforms and processing engines that support both real-time streaming and batch workloads
- Optimize existing data platforms and infrastructure while evaluating new technologies
- Provide technical leadership to the data engineering team on how to store and process data more efficiently and with lower latency at scale
- Build and scale stream & batch processing platforms using the latest open-source technologies
- Work with data engineering teams and help build reference implementations for different use cases
- Improve existing tools to deliver value to the users of the platform
- Work with data engineers to create services that ingest data from and supply data to external sources while ensuring data quality and timeliness (see the sketch after this list)
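
As an illustration of this kind of ingestion service, here is a minimal Spark Structured Streaming sketch in Scala that reads events from Kafka and lands them in object storage for batch consumers. The broker address, topic name, and S3 paths are placeholder assumptions, not details of our actual platform.

```scala
import org.apache.spark.sql.SparkSession

object EventIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-ingest")
      .getOrCreate()

    // Read a stream of events from Kafka; broker and topic names are placeholders.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS payload", "timestamp")

    // Land the raw events in a batch-queryable location for downstream consumers.
    val query = events.writeStream
      .format("parquet")
      .option("path", "s3://example-bucket/raw/events")              // hypothetical path
      .option("checkpointLocation", "s3://example-bucket/chk/events") // hypothetical path
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```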
Requirements
- 8+ years of software development experience, including at least 3 years with open-source big data technologies
- Knowledge of common design patterns used in Complex Event Processing
- Knowledge of streaming technologies: Apache Kafka, Kafka Streams, KSQL, Spark, Spark Streaming
- Proficiency in Java and Scala
- Strong hands-on experience with SQL, Hive, Spark SQL, data modeling, and schema design
- Experience with and deep understanding of traditional relational, NoSQL, and columnar databases
- Experience building scalable infrastructure to support stream, batch, and micro-batch data processing
- Experience utilizing Apache Iceberg as the backbone of a modern lakehouse architecture
- Experience utilizing AWS Glue as a centralized data catalog to register and manage Iceberg tables (see the sketch after this list)
- Experience working with Druid, StarRocks, Apache Pinot, or similar engines to power low-latency queries, routine Kafka ingestion, and fast joins across both historical and real-time data
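
For context on the Iceberg and Glue items above, here is a minimal sketch of registering and querying an Iceberg table through the AWS Glue catalog using Spark SQL in Scala. The catalog name, warehouse bucket, and table schema are illustrative assumptions rather than a prescribed setup.

```scala
import org.apache.spark.sql.SparkSession

object LakehouseQuery {
  def main(args: Array[String]): Unit = {
    // Configure an Iceberg catalog backed by AWS Glue; catalog name ("glue") and
    // warehouse path are hypothetical.
    val spark = SparkSession.builder()
      .appName("lakehouse-query")
      .config("spark.sql.extensions",
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
      .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
      .config("spark.sql.catalog.glue.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
      .config("spark.sql.catalog.glue.warehouse", "s3://example-bucket/warehouse")
      .getOrCreate()

    // Create an Iceberg table registered in the Glue catalog (illustrative schema).
    spark.sql(
      """CREATE TABLE IF NOT EXISTS glue.analytics.page_views (
        |  user_id BIGINT,
        |  url     STRING,
        |  ts      TIMESTAMP)
        |USING iceberg
        |PARTITIONED BY (days(ts))""".stripMargin)

    // Query it with plain Spark SQL, like any other table in the catalog.
    spark.sql(
      """SELECT url, COUNT(*) AS views
        |FROM glue.analytics.page_views
        |WHERE ts >= current_timestamp() - INTERVAL 1 DAY
        |GROUP BY url
        |ORDER BY views DESC
        |LIMIT 20""".stripMargin).show()

    spark.stop()
  }
}
```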