
Data Engineer – Mid-level
Runtalent
Full-time
Location Type: Remote
Location: Brazil
About the role
- Design and build scalable ETL/ELT data pipelines using dbt, PySpark and other modern transformation tools.
- Develop and maintain data ingestion pipelines for GenAI workloads, including document processing, chunking and embedding workflows.
- Orchestrate workflows using Airflow, Dagster or cloud-native orchestration tools.
- Plan and execute data migration projects, including source data analysis, schema mapping, validation and rollback strategies.
- Implement Change Data Capture (CDC) solutions using industry-standard tools.
- Build and maintain data quality frameworks with automated tests and validations.
- Ensure data governance, security and compliance, including proper handling of PII (personally identifiable information) and enforcement of RBAC (role-based access control) policies.
- Collaborate with AI Engineers and Full-Stack Developers to support RAG pipelines and GenAI-based applications.
- Apply event-driven architecture concepts to design scalable and reliable data processing solutions.
Requirements
- Proven experience developing and deploying production-scale data pipelines.
- Strong proficiency in Python, PySpark and advanced SQL (window functions, CTEs, performance optimization).
- Hands-on experience with data migration projects.
- Experience with at least one major cloud platform (AWS, Azure or GCP).
- Experience with Databricks, AWS data services or Microsoft Fabric for pipeline development.
- Experience with modern data warehouses such as Snowflake, BigQuery, Redshift or Databricks.
- Experience with relational databases (PostgreSQL, MySQL) and NoSQL databases (MongoDB, DynamoDB).
- Experience with data migration tools for on-premises or cloud environments (e.g., SSIS).
- Practical experience with Apache Spark / PySpark and workflow scheduling (AWS Glue or similar).
- Familiarity with Infrastructure-as-Code and containerization tools (Terraform, Docker).
- Experience with CI/CD pipelines (preferably GitHub Actions).
- Strong knowledge of data modeling (Star Schema, Data Vault, Dimensional Modeling).
Benefits
- Remote work
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
ETL, ELT, dbt, PySpark, Python, SQL, data migration, data modeling, Change Data Capture, data quality frameworks