Lean Tech

Senior Data Engineer

full-time

Origin: 🇨🇴 Colombia

Job Level

Senior

Tech Stack

Apache, AWS, Azure, Cloud, ERP, ETL, IoT, Java, Kafka, NumPy, Pandas, PySpark, Python, Scala, Spark, SQL, Tableau

About the role

  • Lead and execute large-scale data engineering projects using the Palantir Foundry platform.
  • Develop robust and scalable ETL/ELT workflows end to end, leveraging Spark-based Transforms, the Pipeline Canvas, and Foundry’s low-code/no-code tools.
  • Design, implement, and optimize Foundry Transforms (Code Workbooks) using Apache Spark (Python/Scala) to cleanse, validate, join, and expose datasets via Datasets or Views.
  • Use Foundry’s Pipeline Canvas to wire together data sources, transformations, and outputs in a visual, low-code environment.
  • Ingest and normalize large, disparate datasets from sources such as ERP, CRM, and IoT streams, operating on daily or near real-time cadences.
  • Define and manage Ontologies (business-friendly data models) so analytics teams can self-serve data reliably and intuitively.
  • Leverage Dataset Builder and Object Library to manage schemas, support full or incremental loads, and standardize dataset governance.
  • Integrate data through standard and custom APIs and connectors (e.g., JDBC, S3, Kafka, Snowflake, Foundry’s Dataset Writer/Reader).
  • Implement and maintain robust data quality checks and automated alerts to ensure early detection of anomalies or breaches of defined thresholds.
  • Monitor and tune pipeline performance and resource usage (partitioning strategies, caching, and load balancing) for production environments.
  • Automate deployment pipelines using CI/CD practices to promote Foundry Transforms into production safely and efficiently.
  • Collaborate with cross-functional teams to ensure alignment between data engineering solutions and key business objectives.
  • Provide technical mentorship, ensuring best practices in code quality, version control, testing, and documentation.
  • Bring domain expertise to aviation-related data challenges where applicable, while remaining open to broader industry applications.

Requirements

  • Strong hands-on experience developing data pipelines and workflows in Palantir Foundry, including Transforms, Ontology modeling, Workspaces, Actions, and Pipeline Canvas.
  • Deep understanding of Apache Spark APIs, including batch and streaming data processing.
  • Advanced programming proficiency in Python; experience in Scala or Java is a plus.
  • Strong command of SQL and experience working with structured, semi-structured, and unstructured data.
  • Familiarity with key Python libraries and tools: PySpark, Pandas, NumPy, Great Expectations, pytest/unittest.
  • Proven track record with CI/CD practices, preferably deploying to Foundry or cloud-based platforms (AWS, Azure).
  • Understanding of data architecture, performance optimization, and modern ELT principles.
  • Experience with data integration tools and connectors (e.g., JDBC, Kafka, S3, Snowflake).
  • Applied experience with ontology management, dataset structuring, and self-serve enablement.
  • Familiarity with agile methodologies (Scrum, Kanban) and managing operational tickets.
  • Strong documentation, version control, and testing habits for data workflows.
  • Industry experience in aviation is good to have.
  • Familiarity with data visualization tools like Tableau or Power BI is good to have.
  • Certifications in AWS, Azure, or other cloud platforms are good to have.
  • Knowledge of machine learning pipelines or data science workflows is good to have.
  • Experience with data governance, compliance standards, and metadata management is good to have.
  • Background in DevOps or infrastructure-as-code for pipeline orchestration is good to have.
  • Strong analytical and problem-solving mindset with attention to detail and data accuracy.
  • Ability to explain technical concepts clearly to both technical and non-technical stakeholders.
  • Proactive collaborator and effective communicator across multidisciplinary teams.
  • Leadership and mentoring skills to guide junior engineers and contribute to a culture of learning.
  • High adaptability to new technologies, including low-code/no-code tooling environments.
  • Excellent time management and organizational skills, capable of juggling multiple priorities effectively.