Jalasoft

Senior Data Engineer – AWS, RAG Pipelines

Senior Data Engineer to design and operate cloud data infrastructure for AI initiatives, building data lakes on AWS and real-time pipelines for RAG systems.

Posted 6/12/2026 • Full-time • Remote • 🇨🇴 Colombia • Senior

Tech Stack

Tools & technologies
Amazon Redshift, AWS, Cloud, Distributed Systems, Elasticsearch, ETL, Java, JavaScript, .NET, Node.js, Postgres, Python

About the role

Key responsibilities & impact
  • Design and operate the cloud data infrastructure powering AI initiatives.
  • Architect production-scale data lakes on AWS.
  • Build real-time ingestion and observability pipelines.
  • Own the vector search and embedding layers that feed RAG systems and autonomous agents.

Requirements

What you’ll need
  • Overall Experience: 7+ years in Data Engineering, Distributed Systems, or Data Architecture
  • AWS & Infrastructure: 4+ years architecting production-scale data lakes, storage tiers, and event streaming
  • AI/LLM Pipelines: 2+ years building RAG systems, managing embeddings, and orchestrating foundation models
  • Proficiency in AWS Data Lake Architecture & Storage
  • Proficiency in Real-Time Observability & Log Analytics
  • Proficiency in Elasticsearch & OpenSearch Optimization, Vectorization, Embeddings
  • Proficiency in Amazon Bedrock & Generative AI Pipelines
  • Proficiency in Software Engineering & API Ingestion
  • Production-level proficiency in one or more of: C# (.NET Core), Java, Python, or Node.js
  • AWS S3 partitioning strategies, lifecycle policies, and columnar and table formats (Parquet, Iceberg)
  • AWS Glue Data Catalog and Lake Formation for multi-tenant, fine-grained access control
  • Query optimization over petabyte-scale datasets using Amazon Athena and Redshift Spectrum
  • Distributed OpenTelemetry (OTel) Collector configuration for log, trace, and metrics capture and routing into S3
  • High-volume streaming of system logs, Datadog captures, and raw server events into S3
  • Real-time CDC from PostgreSQL using Debezium or AWS DMS
  • Amazon OpenSearch clusters with simultaneous lexical and high-dimensional vector search
  • OpenSearch index lifecycle management, sharding strategies, and dynamic mappings at scale
  • Amazon Bedrock foundation model APIs (Claude, Titan) for data enrichment, classification, and semantic parsing
  • Knowledge Bases for Amazon Bedrock for automatic chunking, metadata extraction, and vector index syncs from S3
  • ETL/ELT pipelines ingesting unstructured event data from SaaS APIs (e.g., Pendo, Hotjar, Google Analytics)
  • MCP server development to expose data lake context and utilities to AI agents

Benefits

Comp & perks
  • Remote work.
  • 13 floating holidays.
  • 15 vacation days per completed year.
  • Good working environment.
