
Data Engineer
SMASH
full-time
Location Type: Remote
Location: United States
About the role
- You will design and operate scalable ETL and streaming pipelines that process contracts and invoices at high volume with strong data quality guarantees.
- This role focuses on building reliable data platforms that power analytics, ROI reporting, and compliance through robust governance, validation, and observability.
- Design and maintain ETL pipelines to ingest contracts and invoices from PDF, DOCX, CSV, Excel, and webhook sources.
- Build scalable workflows for historical data migrations (10K+ invoices per customer).
- Implement real-time streaming pipelines for event-driven integrations.
- Develop and manage an analytics data warehouse to support reporting, metrics, and trend analysis.
- Model customer-specific datasets for ROI, savings, and exception reporting.
- Implement data validation checks for completeness, accuracy, and consistency (an illustrative sketch follows this list).
- Build data quality monitoring, alerting, and dead-letter queue handling.
- Implement PII/PHI detection, masking, and data retention policies (5-year audit trail).
- Track data lineage from source through transformation to consumption.
- Optimize SQL queries and data models for performance and scalability.
- Collaborate with product, engineering, and analytics teams to evolve data requirements.
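
For context on the validation work described above, here is a minimal, hypothetical pandas sketch. The column names (invoice_id, vendor, amount, issue_date) and the specific rules are illustrative assumptions, not taken from SMASH's actual pipelines.

```python
import pandas as pd

# Hypothetical schema for illustration only; not from the posting.
REQUIRED_COLUMNS = ["invoice_id", "vendor", "amount", "issue_date"]

def validate_invoices(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rows that fail basic completeness and consistency checks."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    checks = pd.DataFrame(index=df.index)
    checks["complete"] = df[REQUIRED_COLUMNS].notna().all(axis=1)
    checks["positive_amount"] = pd.to_numeric(df["amount"], errors="coerce") > 0
    checks["valid_date"] = pd.to_datetime(df["issue_date"], errors="coerce").notna()

    # Rows failing any check could be routed to a dead-letter store for review.
    out = df.copy()
    out["is_valid"] = checks.all(axis=1)
    return out

if __name__ == "__main__":
    sample = pd.DataFrame(
        {
            "invoice_id": ["INV-001", "INV-002"],
            "vendor": ["Acme", None],
            "amount": [1250.00, -40.0],
            "issue_date": ["2024-01-15", "not-a-date"],
        }
    )
    print(validate_invoices(sample)[["invoice_id", "is_valid"]])
```

In practice these checks would feed the monitoring, alerting, and dead-letter handling mentioned above rather than printing to stdout.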
Requirements
- Strong experience building and operating ETL pipelines in production environments.
- Proficiency in Python for data processing (pandas, numpy, pyspark).
- Advanced SQL skills with PostgreSQL, including data modeling and query optimization.
- Hands-on experience with workflow orchestration tools (Airflow, Prefect, or similar); see the orchestration sketch after this list.
- Experience designing and operating data warehouses (Redshift, BigQuery, or Snowflake).
- Familiarity with streaming platforms such as Kafka or Kinesis.
- Experience implementing data quality frameworks (Great Expectations or similar).
- Strong understanding of data validation, error handling, and monitoring best practices.
- Ability to design scalable systems handling large datasets and schema complexity.
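
As a rough illustration of the orchestration experience listed above, the sketch below wires two placeholder steps into an Airflow DAG. The DAG id, schedule, and task callables are hypothetical stand-ins for whatever extraction, validation, and load logic a real pipeline would use.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical task callables; the real pipeline steps are not specified in the posting.
def extract_invoices():
    print("extract invoices from source systems")

def validate_and_load():
    print("run validation checks, then load valid rows into the warehouse")

with DAG(
    dag_id="invoice_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_invoices)
    load = PythonOperator(task_id="validate_and_load", python_callable=validate_and_load)

    # Run extraction before validation/load.
    extract >> load
```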
Benefits
- We believe in long-lasting relationships with our talent.
- We invest time in getting to know each person and understanding what they are looking for in their next professional step.
- We aim to find the perfect match.
- As agents, we pair our talent with our US clients based not only on technical skills but also on cultural fit.
- Our core competency is finding the right talent fast.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
ETL pipelines, data processing, Python, pandas, numpy, pyspark, SQL, PostgreSQL, data modeling, query optimization
Soft skills
collaboration, problem-solving, attention to detail, analytical thinking