Tech Stack
AWS, Cloud, Docker, ETL, Go, Google Cloud Platform, Hadoop, Java, Kafka, Kubernetes, Python, Scala, Spark
About the role
- Design, implement, and maintain data pipelines for extracting, transforming, and loading data from a wide variety of sources into various data services
- Maintain and improve current server-side scripts, libraries, APIs, and SDKs
- Identify, design, and implement system performance enhancements and internal process improvements
- Automate manual processes and optimize data delivery
- Build robust and scalable software
Requirements
- 3–4 years of experience in a software engineering environment
- Knowledge of Hadoop, Spark, Kafka, or other equivalent technologies
- Proficiency in at least one of the following languages: Scala, Java, Python, or Go
- Experience with automated testing systems
- Knowledge of data modeling, data warehousing, ETL processes, and business intelligence reporting tools
- Experience working with CI/CD, containerization, and virtualization tools such as Kubernetes and Docker
- Development experience with Amazon Web Services (AWS) and Google Cloud Platform (GCP) infrastructure and technologies, as well as Data Lakes, is an asset
- Natural curiosity and a demonstrated pattern of exploring multiple approaches to find the most efficient, scalable solution to a problem
- A passion for Big Data and a strong interest in staying current with the latest trends, tools, and data innovations