Salary
💰 $162,000 - $217,000 per year
Tech Stack
Airflow, Distributed Systems, Hadoop, Kafka, Postgres, Python, Ruby on Rails, Scala, Spark, SQL
About the role
- Design, develop, and maintain comprehensive access controls and governance systems to strengthen data integrity and privacy.
- Ensure the reliability, scalability, and security of the data platform.
- Collaborate with various stakeholders and actively participate in data infrastructure design and execution.
- Work on a platform that processes petabytes of data daily, using technologies such as Scala, Python, Postgres, Snowflake, Delta Lake, Iceberg, ClickHouse, Spark, dbt, Flink, Kafka, and Airflow.
- Shape the data strategy and build the data lake platform end to end, from design and technical decision-making through project management and execution.
- Partner with Product, Data Science, and ML teams to understand needs and provide high-quality solutions.
Requirements
- Has 5+ years of experience in software engineering.
- Exhibits an in-depth understanding of distributed systems, with proven experience in data processing technologies such as dbt and Airflow and in common web frameworks such as Rails.
- Proficiently uses SQL for writing and reviewing complex queries for data analysis and debugging.
- Can design for scale with the entire system in mind.
- Communicates capably and is comfortable seeking and receiving feedback.
- Possesses strong analytical and debugging skills.
- Takes strong ownership while working with large codebases and a diverse suite of products.
- Embraces a collaborative mindset, partnering with engineers, designers, and PMs across multiple teams to co-create impactful solutions and support contributions across systems.
- Communicates clearly, presents ideas well, and can influence key stakeholders at manager, director, and VP levels.
- Holds a Bachelor’s degree in Computer Science, Software Engineering, or a related field, or can demonstrate equivalent industry experience (4+ years).
- Has prior work experience with data platforms.
- Has experience with big data technologies such as Spark, Hadoop, Flink, Hive, or Kafka, and with both streaming and batch data pipelines.
- Has proven experience with distributed system design.
- Possesses strong general programming and algorithm skills.
- Shows strong attention to detail and accuracy in implementation.
- Has strong experience writing complex, optimized SQL queries.
- Brings a data-driven mindset.