Tech Stack
AWS, Azure, Cloud, ETL, Java, Python, Roku, Scala, Spark
About the role
- Design, develop, and maintain scalable and moderately complex data pipelines using Databricks, Spark, and related technologies
- Optimize data workflows to build performant, reliable, and efficient data solutions
- Work closely with cross-functional teams, including data scientists, analysts, and software engineers, to understand requirements and implement robust solutions in a highly collaborative environment
- Collaborate with other data engineers to ensure the smooth deployment and operation of data solutions
- Monitor, troubleshoot, and resolve issues related to data pipelines and platform performance
- Implement best practices for data modeling, ETL processes, and data ingestion; stay current with industry trends and advancements
- Create and maintain detailed documentation of data solutions for knowledge sharing and future reference
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field
- 2-4 years of experience as a Data Engineer, with a strong focus on Databricks, Spark, and cloud-based data platforms (e.g., AWS, Azure, Google Cloud)
- Expertise in designing, developing, and optimizing complex ETL processes and data pipelines
- Proficiency in programming languages like Python, Scala, or Java
- Solid understanding of data modeling, data warehousing, and data integration concepts
- Familiarity with DevOps practices and tools for CI/CD and infrastructure automation
- Strong problem-solving skills and ability to troubleshoot and resolve complex technical issues
- Excellent communication and teamwork skills, with the ability to collaborate effectively with cross-functional teams
- Experience with big data technologies, real-time data processing, and machine learning pipelines is a plus
- Familiarity with weather data is a plus
- Relevant certifications in Databricks, Spark, or cloud platforms are a strong advantage