Tech Stack
Airflow, Amazon Redshift, AWS, Cloud, Cyber Security, Java, MySQL, Python, Scala, Spark, SQL
About the role
- Develop high-volume data processing jobs across various platforms including Redshift, Athena, Iceberg, Spark, and Airflow (a sketch of this kind of job follows this list)
- Work closely with data scientists to turn theory and methodology into viable products
- Identify bottlenecks in system performance and make recommendations on improving process efficiency
- Support and collaborate with data scientists, analysts, and product managers
- Mentor and guide junior team members on architectural principles, best practices, and effective ways of working
- Help process Ookla's global collection of data on mobile and fixed internet performance
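To give a concrete feel for the day-to-day work described above, here is a minimal PySpark batch-aggregation sketch. It is purely illustrative: the table names, columns (`test_date`, `network_type`, `download_mbps`, `upload_mbps`), and the Iceberg target are assumptions, not Ookla's actual schema or pipeline.

```python
# Minimal sketch of a high-volume batch job in PySpark. All table and
# column names are hypothetical, chosen only to illustrate the shape of
# the work; they are not Ookla's real schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("speedtest-daily-rollup").getOrCreate()

# Read raw speed-test measurements (table name is an assumption).
measurements = spark.read.table("raw.speedtest_measurements")

# Roll up one day of tests into per-network-type performance metrics.
daily = (
    measurements
    .where(F.col("test_date") == "2024-01-01")
    .groupBy("test_date", "network_type")
    .agg(
        F.avg("download_mbps").alias("avg_download_mbps"),
        F.avg("upload_mbps").alias("avg_upload_mbps"),
        F.count("*").alias("test_count"),
    )
)

# Append the rollup to a (hypothetical) Iceberg table in the warehouse.
daily.writeTo("analytics.daily_performance").append()
```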
Requirements
- Bachelor's degree in Computer Science or related field or equivalent experience
- Strong understanding of programming concepts, with experience in at least one object-oriented language (Python, Scala, or Java)
- Strong background and 3+ years of direct experience working with Python
- Experience with high-volume distributed data processing in Spark
- Strong working knowledge of SQL, databases (MySQL / Redshift / Athena / Trino), data modeling, and modern data warehouse (lakehouse) management
- Excellent troubleshooting and analysis skills
- Excited to learn and work with new technologies in a dynamic environment
- Strong communication skills, both in person and in virtual settings
- Strong time management skills and a self-driven work ethic
- Experience with AWS cloud services is a plus
- Experience with event-driven architecture and streaming data pipelines is a plus (a minimal streaming sketch follows this list)
- Background check required
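For candidates wondering what the "streaming data pipelines" plus above might look like in practice, here is a minimal Spark Structured Streaming sketch. The Kafka topic, broker address, and S3 paths are placeholders invented for illustration, not part of the role's actual infrastructure.

```python
# Minimal Spark Structured Streaming sketch for an event-driven pipeline.
# Topic names, brokers, schema, and paths are all hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("speedtest-stream").getOrCreate()

event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("download_mbps", DoubleType()),
])

# Consume raw events from Kafka (topic and brokers are assumptions).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "speedtest-events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Write parsed events to a sink; the checkpoint makes the job restartable.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3://example-bucket/speedtest/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/")
    .start()
)
query.awaitTermination()
```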