Senior Data Engineer – Crawlers & Orchestration Specialist

Cortex

Full-time

Posted on: 3/26/2026

Location Type: Remote

Location: Brazil

About the role

Develop and maintain high-performance, resilient crawlers/bots for large-scale data extraction.
Design and implement complex data pipelines using Databricks (Spark) for batch and streaming processing.
Ensure the health and reliability of data flows using advanced orchestration tools.
Manage and optimize resources within the AWS ecosystem to ensure scalability and cost efficiency.
Implement error handling, anti-blocking measures (proxies, CAPTCHA handling), and data quality validation for collected data.

Requirements

Deep proficiency in Python (focused on scraping libraries such as Scrapy, Playwright, Selenium, or Beautiful Soup).
Solid experience with Databricks and Apache Spark (PySpark).
Experience with AWS services such as S3, Lambda, Glue, Athena, EC2, and EKS.
Advanced knowledge of orchestration tools such as Airflow, Dagster, or Prefect.
Experience with SQL and NoSQL databases, and an understanding of Data Lakehouses (Delta Lake).
Familiarity with Docker, Kubernetes, and CI/CD pipelines.

Benefits

Meal and food vouchers (Vale Refeição and Vale Alimentação).
Gympass/TotalPass.
Home-office allowance.
Health plan and dental plan (dental optional).
Childcare assistance (up to the child’s 6th birthday).
Extended maternity, paternity, and adoptive leave (#todasasfamíliasimportam / #allfamiliesmatter).
Life insurance.
Birthday Day Off (one day off to take on your birthday or during your birthday month).
Family Day (one day off for parents to take between May and August and enjoy as they wish).
Mental Break (one full week off in December to rest and recharge).
*Benefits according to current policy*

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Python, Scrapy, Playwright, Selenium, Beautiful Soup, Databricks, Apache Spark, SQL, NoSQL, Data Lakehouses