Tech Stack
Apache, AWS, Azure, Cloud, Java, Python, Scala, Spark, SQL
About the role
- Design, develop, and maintain data pipelines using Databricks and Apache Spark
- Integrate data from various sources into Databricks, ensuring quality and consistency
- Optimize Spark jobs for performance and cost efficiency
- Collaborate with data scientists, analysts, and stakeholders to understand requirements and deliver solutions
- Create and maintain data models to support analytics and reporting
- Monitor and troubleshoot data pipelines
- Document processes, architectures, and workflows
- Apply best practices in data engineering and ensure compliance with governance policies
Requirements
- 6+ years of experience in Data Engineering with a strong focus on Databricks and Apache Spark
- Proficiency in Python, Scala, or Java
- Experience with cloud platforms (AWS, Azure, or Google Cloud)
- Strong SQL skills for querying and data manipulation
- Familiarity with data warehousing concepts and tools
- Version control experience (Git)
- Strong communication skills for cross-functional collaboration
- Desired: Experience with machine learning frameworks and libraries
- Desired: Knowledge of data visualization tools
- Desired: Familiarity with CI/CD practices for data pipelines
- Languages: advanced spoken English; native Spanish