Data Engineer – Mid-level

Runtalent

Full-time

Location Type: Remote

Location: Brazil

About the role

  • Design and build scalable ETL/ELT data pipelines using dbt, PySpark and other modern transformation tools (see the PySpark sketch after this list).
  • Develop and maintain data ingestion pipelines for GenAI workloads, including document processing, chunking and embedding workflows.
  • Orchestrate workflows using Airflow, Dagster or cloud-native orchestration tools (a minimal Airflow example also follows this list).
  • Plan and execute data migration projects, including source data analysis, schema mapping, validation and rollback strategies.
  • Implement Change Data Capture (CDC) solutions using industry-standard tools.
  • Build and maintain data quality frameworks with automated tests and validations.
  • Ensure data governance, security and compliance, including proper handling of PII (personally identifiable information) and enforcement of RBAC (role-based access control) policies.
  • Collaborate with AI Engineers and Full-Stack Developers to support RAG pipelines and GenAI-based applications.
  • Apply event-driven architecture concepts to design scalable and reliable data processing solutions.
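
For a sense of the pipeline work above, here is a minimal PySpark sketch of a batch extract-transform-load step. The paths, table names and columns are hypothetical, not taken from the posting:

```python
# Minimal PySpark ETL sketch. Paths, tables and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_daily_load").getOrCreate()

# Extract: read raw landing-zone data (hypothetical path).
raw = spark.read.parquet("s3://landing/orders/")

# Transform: keep rows with a status, derive a date partition column,
# and deduplicate on the business key (survivor chosen arbitrarily; a
# window over ingestion time would keep the latest record instead).
cleaned = (
    raw.filter(F.col("status").isNotNull())
       .withColumn("order_date", F.to_date("created_at"))
       .dropDuplicates(["order_id"])
)

# Load: write partitioned output for the curated/warehouse layer.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://curated/orders/"
)
```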
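And a minimal orchestration sketch, assuming Airflow 2.4+ (for the `schedule` argument); the DAG id, schedule and task bodies are placeholders:

```python
# Minimal Airflow DAG sketch; dag_id, schedule and task logic are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    """Pull raw data from the source system (placeholder)."""

def transform():
    """Run the transform step, e.g. submit the PySpark job above (placeholder)."""

def load():
    """Publish curated data to the warehouse layer (placeholder)."""

with DAG(
    dag_id="orders_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```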

Requirements

  • Proven experience developing and deploying production-scale data pipelines.
  • Strong proficiency in Python, PySpark and advanced SQL (window functions, CTEs, performance optimization); see the illustration after this list.
  • Hands-on experience with data migration projects.
  • Experience with at least one major cloud platform (AWS, Azure or GCP).
  • Experience with Databricks, AWS data services or Microsoft Fabric for pipeline development.
  • Experience with modern data warehouses such as Snowflake, BigQuery, Redshift or Databricks.
  • Experience with relational databases (PostgreSQL, MySQL) and NoSQL databases (MongoDB, DynamoDB).
  • Experience with data migration tools for on-premises or cloud environments (e.g., SSIS).
  • Practical experience with Apache Spark / PySpark and workflow scheduling (AWS Glue or similar).
  • Familiarity with Infrastructure-as-Code and containerization tools (Terraform, Docker).
  • Experience with CI/CD pipelines (preferably GitHub Actions).
  • Strong knowledge of data modeling (Star Schema, Data Vault, Dimensional Modeling).
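
As a rough illustration of the advanced-SQL bar above (a CTE plus window functions), here is a hypothetical query run through Spark SQL so the example stays in Python; the `orders` table and its columns are assumptions for illustration only:

```python
# Hypothetical query combining a CTE with window functions; assumes an
# `orders` table or temp view is already registered in the session.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql_example").getOrCreate()

latest_orders = spark.sql("""
    WITH ranked AS (
        SELECT
            customer_id,
            order_id,
            amount,
            ROW_NUMBER() OVER (
                PARTITION BY customer_id
                ORDER BY created_at DESC
            ) AS rn,
            SUM(amount) OVER (PARTITION BY customer_id) AS lifetime_value
        FROM orders
    )
    SELECT customer_id, order_id, amount, lifetime_value
    FROM ranked
    WHERE rn = 1  -- most recent order per customer
""")
```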

Benefits

  • Remote work

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

ETL, ELT, dbt, PySpark, Python, SQL, data migration, data modeling, Change Data Capture, data quality frameworks