Tech Stack
Airflow, ETL, Kafka, Python, Spark, SQL
About the role
- Build and maintain ETL / streaming pipelines for logs, events, metadata, attribution
- Work on aggregation, feature construction, smoothing, and correctness layers
- Ensure data quality: alerting, reconciliation, schema validation, drift detection
- Optimize for low-latency queries and data access patterns used by enforcement engines and dashboards
- Partner with software engineers and analytics teams to expose data through APIs or internal tools
- Instrument logging and observability within your pipelines (monitoring, health dashboards)
- Over time, help build feature stores or real-time feature services for models or rules
Requirements
- 3+ years building data pipelines (batch, stream) in production
- Strong Python and SQL skills, plus experience with data infrastructure (e.g. Kafka, Airflow, Flink, Spark)
- Experience with data integrity, schema evolution, partitioning, compaction, etc.
- Deep understanding of performance, latency, and indexing
- Comfortable designing for failure, retries, backfills, idempotency
- You treat data as first-class infrastructure
- Experience in fraud, security, or abuse domains
- Familiarity with real-time feature serving or feature store systems
- Exposure to internal tooling APIs or data mesh architectures
Benefits
- Flexible work arrangements
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
ETL, streaming pipelines, Python, SQL, Kafka, Airflow, Flink, Spark, data integrity, schema evolution
Soft skills
data quality, designing for failure, treating data as infrastructure