
Data Engineer
MetroStar
Full-time
Location Type: Hybrid
Location: Reston • Virginia • United States
Salary
💰 $119,000 - $129,000 per year
About the role
- Design and implement data ingestion, transformation, and enrichment pipelines across multiple concurrent projects with varying data modalities (time-series sensor data, video, images, documents, and metadata).
- Develop and manage cloud-native data services including object storage workflows, vector database integration, and structured data warehousing to support multi-modal AI/ML systems.
- Work closely with AI/ML engineers to operationalize data pipelines that feed training, inference, and retrieval-augmented generation (RAG) workloads in production.
- Establish data quality, lineage, and governance practices across projects that are maturing from prototype to product, bringing structure and repeatability to evolving data ecosystems.
- Support the processing and organization of unstructured data (video files, PDFs, technical manuals) into formats suitable for embedding generation, semantic search, and summarization.
- Present technical approaches and data architecture decisions to both technical teammates and non-technical stakeholders.
Requirements
- Bachelor's Degree in Computer Science, Data Science, Information Systems, Engineering, or a comparable technical discipline.
- An active Secret clearance or the ability to obtain one.
- 2-4+ years of professional experience in data engineering, data platform development, or a closely related technical role.
- Relevant cloud or data engineering certifications are a plus (e.g., AWS Certified Data Engineer, Databricks Data Engineer Associate, AWS Solutions Architect, or equivalent).
- Strong proficiency in Python for data engineering (scripting, pipeline development, data transformation).
- Experience designing and building ETL/ELT pipelines for structured and semi-structured data in cloud environments.
- Experience with AWS cloud services for data workflows (S3, RDS, DynamoDB, EC2/ECS, and related services).
- Hands-on experience with at least one distributed data processing framework (Databricks, Spark, Dask, Ray, or equivalent).
- Demonstrated ability to work with diverse data modalities (time-series, sensor telemetry, image, video, unstructured text).
- Experience with SQL and data warehousing concepts (schema design, partitioning, incremental processing).
- Strong experience with data pipeline orchestration, scheduling, and monitoring in production environments.
- Experience building data pipelines that ingest, transform, or serve data through RESTful APIs.
- Strong communication skills with the ability to explain data architecture decisions to both ML engineers and non-technical stakeholders.
Benefits
- Health, dental, and vision insurance
- 401(k) retirement plan with company match
- Paid time off (PTO) and holidays
- Parental leave and dependent care
- Flexible work arrangements
- Professional development opportunities
- Employee assistance and wellness programs
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, ETL, ELT, data transformation, data pipeline orchestration, SQL, data warehousing, distributed data processing, data ingestion, data quality
Soft Skills
strong communication skills, ability to explain technical concepts, collaboration with technical and non-technical stakeholders
Certifications
AWS Certified Data Engineer, Databricks Data Engineer Associate, AWS Solutions Architect