Tech Stack
Airflow, Apache, AWS, Azure, Cloud, ETL, Google Cloud Platform, Grafana, Java, Kafka, Pandas, PySpark, Python, Spark, SQL, Tableau, TCP/IP
About the role
- Serve as a core contributor specializing in JVM-based frameworks, owning and maintaining critical parts of ClickHouse's data engineering ecosystem.
- Own the full lifecycle of data framework integrations, from the core database driver that handles extremely high throughput to the SDKs and connectors that make ClickHouse feel native in JVM-based applications.
- Craft tools that enable data engineers to harness ClickHouse's speed and scale, building foundations relied on by thousands of engineers.
- Collaborate closely with the open-source community, internal teams, and enterprise users to ensure JVM integrations set the standard for performance, reliability, and developer experience.
- Directly impact systems processing massive datasets, including real-time analytics platforms and observability systems; optimize for throughput and reliability.
Requirements
- 6+ years of software development experience focusing on building and delivering high-quality, data-intensive solutions.
- Proven experience with the internals of at least one of the following technologies: Apache Spark, Apache Flink, Kafka Connect, or Apache Beam.
- Experience developing or extending connectors, sinks, or sources for at least one big data processing framework such as Apache Spark, Flink, Beam, or Kafka Connect.
- Strong understanding of database fundamentals: SQL, data modeling, query optimization, and familiarity with OLAP/analytical databases.
- A track record of building scalable data integration systems (beyond simple ETL jobs).
- Strong proficiency in Java and the JVM ecosystem, including deep knowledge of memory management, garbage collection tuning, and performance profiling.
- Solid experience with concurrent programming in Java, including threads, executors, and reactive or asynchronous patterns.
- Outstanding written and verbal communication skills to collaborate effectively within the team and across engineering functions.
- Understanding of JDBC, network protocols (TCP/IP, HTTP), and techniques for optimizing data throughput over the wire.
- Passion for open-source development.
- Bonus: prior contributions to open-source projects, familiarity with ClickHouse or similar high-performance data platforms, and working knowledge of Python (Pandas, PySpark, Airflow).