Design and implement data ingestion, transformation, and enrichment pipelines across multiple concurrent projects with varying data modalities (time-series sensor data, video, images, documents, and metadata).
Develop and manage cloud-native data services including object storage workflows, vector database integration, and structured data warehousing to support multi-modal AI/ML systems.
Work closely with AI/ML engineers to operationalize data pipelines that feed training, inference, and retrieval-augmented generation (RAG) workloads in production.
Establish data quality, lineage, and governance practices across projects that are maturing from prototype to product, bringing structure and repeatability to evolving data ecosystems.
Support the processing and organization of unstructured data (video files, PDFs, technical manuals) into formats suitable for embedding generation, semantic search, and summarization.
Present technical approaches and data architecture decisions to both technical teammates and non-technical stakeholders.

Requirements

Bachelor's Degree in Computer Science, Data Science, Information Systems, Engineering, or a comparable technical discipline.
An active Secret clearance or the ability to obtain one.
2-4+ years of professional experience in data engineering, data platform development, or a closely related technical role.
Relevant cloud or data engineering certifications are a plus (e.g., AWS Certified Data Engineer, Databricks Data Engineer Associate, AWS Solutions Architect, or equivalent).
Strong proficiency in Python for data engineering (scripting, pipeline development, data transformation).
Experience designing and building ETL/ELT pipelines for structured and semi-structured data in cloud environments.
Experience with AWS cloud services for data workflows (S3, RDS, DynamoDB, EC2/ECS, and related services).
Hands-on experience with at least one distributed data processing framework (Databricks, Spark, Dask, Ray, or equivalent).
Demonstrated ability to work with diverse data modalities (time-series, sensor telemetry, image, video, unstructured text).
Experience with SQL and data warehousing concepts (schema design, partitioning, incremental processing).
Strong experience with data pipeline orchestration, scheduling, and monitoring in production environments.
Experience building data pipelines that ingest, transform, or serve data through RESTful APIs.
Strong communication skills with the ability to explain data architecture decisions to both ML engineers and non-technical stakeholders.

Benefits
