Tech Stack
Apache, Distributed Systems, HDFS, Java, Kafka, Python, Scala, Spark, SQL
About the role
- Lead the design and optimization of data execution engines, data format handling, and query-layer integration for the Infinia Data Engine.
- Implement high-performance indexing (B-epsilon trees, full-text indexing, vectorization) and systems for high-throughput data access and transformation using Parquet, ORC, and Avro.
- Engineer integration layers supporting Trino, Apache Spark, Apache Iceberg, Delta Lake, HDFS, and Hive Metastore to enable open-source client compatibility.
- Build and tune execution plans leveraging Infinia’s high-throughput I/O and compute for large-scale AI and analytics workloads.
- Analyze and optimize distributed query execution, data storage, caching, and memory usage; write automated tests to validate correctness and performance.
- Contribute to open-source ecosystems and collaborate with external projects where appropriate.
- Partner with Data Scientists, Platform Engineers, and Product Managers; provide technical leadership, mentorship, and design direction to the team.
- Participate in an on-call rotation to provide after-hours support as needed.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 12+ years of experience in software development, with 5+ years in distributed systems, data platforms, or big data technologies.
- Expert-level knowledge of SQL, Python, and Java or Scala.
- Experience working with Apache Spark, distributed query engines, or distributed databases.
- Strong familiarity with HDFS, Hive Metastore, and data partitioning strategies.
- Experience with the Parquet, ORC, and Avro file formats (preferred).
- Hands-on experience with Apache Iceberg and/or Delta Lake (preferred).
- Background in real-time data streaming using tools such as Apache Kafka (preferred).
- Prior experience with C++ (preferred).
- Prior contributions to open-source projects (preferred); committer status is a plus.
- Proven ability to lead complex technical initiatives and mentor junior engineers.
- Ability to participate in an on-call rotation to provide after-hours support as needed.