
Senior Software Engineer, Data Platform
Apella
Full-time
Location Type: Remote
Location: United States
Salary
💰 $175,000 - $225,000 per year
About the role
- Build and extend batch pipelines using dbt for transformations and Dagster for orchestration, scheduling, and asset-driven lineage (see the illustrative sketch after this list).
- Develop and optimize BigQuery data models (dimensional, wide-table, or domain-oriented) to support analytics, experimentation, and reporting use cases.
- Advance real-time streaming capabilities by implementing and maintaining Kafka/PubSub + Flink pipelines, primarily using FlinkSQL, to deliver low-latency datasets and event-derived metrics.
- Design data platform standards: SDLC, naming conventions, modeling patterns, incremental strategies, schema evolution approaches, and best practices for both batch and streaming, including CI/CD and testing.
- Improve reliability and observability by implementing monitoring, alerting, and SLAs/SLOs for pipelines and data quality.
- Partner with analytics, product, and engineering teams to onboard new data sources, define contracts, and deliver trusted datasets.
- Own platform operations including performance tuning, data quality, cost optimization, and scaling across both warehouse and streaming systems.
- Design a unified serving layer architecture that cleanly exposes consistent, trusted datasets across both batch and streaming systems.
- Establish strong data governance, reliability standards, and observability practices.
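For candidates less familiar with how dbt and Dagster fit together, the sketch below illustrates the general pattern the first responsibility describes: dbt models loaded as Dagster assets so runs, schedules, and lineage are asset-driven. This is a minimal, hedged illustration, not Apella's actual setup; the project name, path, and schedule are hypothetical, and it assumes the dagster and dagster-dbt packages are installed.

```python
# Illustrative sketch only: assumes dagster + dagster-dbt and a dbt project
# living in ./analytics; names, paths, and the schedule are hypothetical.
from pathlib import Path

from dagster import AssetExecutionContext, Definitions, ScheduleDefinition, define_asset_job
from dagster_dbt import DbtCliResource, DbtProject, dbt_assets

analytics_project = DbtProject(project_dir=Path("analytics"))
analytics_project.prepare_if_dev()  # builds the dbt manifest during local dev

@dbt_assets(manifest=analytics_project.manifest_path)
def analytics_models(context: AssetExecutionContext, dbt: DbtCliResource):
    # Each dbt model surfaces as a Dagster asset, which is what gives
    # asset-driven lineage across the batch pipeline.
    yield from dbt.cli(["build"], context=context).stream()

defs = Definitions(
    assets=[analytics_models],
    resources={"dbt": DbtCliResource(project_dir=analytics_project)},
    schedules=[
        ScheduleDefinition(
            job=define_asset_job("daily_dbt_build", selection="*"),
            cron_schedule="0 6 * * *",  # one daily refresh, purely illustrative
        )
    ],
)
```

Running `dagster dev` against a module containing these Definitions surfaces each dbt model as an asset with its upstream and downstream lineage, which is the orchestration pattern the role builds on.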
Requirements
- Strong proficiency in SQL (advanced querying, performance considerations, data modeling).
- Hands-on experience with dbt (models, tests, sources, macros, snapshots, incremental strategies).
- Experience with batch orchestration tooling such as Dagster or Airflow (assets/jobs, schedules/sensors, partitioning, backfills, observability).
- Proficiency in Python for data engineering tasks (pipeline glue code, libraries, tooling, testing).
- Deep familiarity with BigQuery or an equivalent cloud-native data warehouse (partitioning/clustering, cost/performance optimization, best practices).
- Solid experience with GCP (or AWS/Azure) infrastructure (core services, IAM, security practices, deployments/automation).
- Strong engineering fundamentals: version control, testing, code review, documentation, and operational ownership.
- Nice to have: Experience with data quality tooling and patterns (e.g., anomaly detection, expectation-based testing, lineage).
- Nice to have: Experience designing semantic layers or metrics layers for analytics.
- Nice to have: Familiarity with event-driven architectures, schema registries, CDC patterns, and schema evolution strategies.
- Nice to have: Experience building or maintaining streaming data pipelines with Kafka and Apache Flink, including FlinkSQL (see the sketch after this list).
- Nice to have: Experience with IaC (e.g., Terraform) and CI/CD for data platforms.
- Nice to have: Understanding of privacy/security controls (PII handling, access controls, auditability).
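For the streaming nice-to-have, the sketch below shows the kind of FlinkSQL pipeline the posting refers to: a Kafka-backed source table aggregated into a low-latency, windowed metric. It is a hedged illustration rather than Apella's actual pipeline; the topic, schema, and print sink are hypothetical, and it assumes the apache-flink (PyFlink) package plus the Kafka SQL connector are available.

```python
# Illustrative sketch only: assumes apache-flink (PyFlink) and the Kafka SQL
# connector jar; the topic, fields, and sink below are hypothetical.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Kafka-backed source table of raw JSON events with an event-time watermark.
t_env.execute_sql("""
    CREATE TABLE events (
        event_id STRING,
        event_type STRING,
        event_ts TIMESTAMP(3),
        WATERMARK FOR event_ts AS event_ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'events',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'latest-offset',
        'format' = 'json'
    )
""")

# Print sink stands in for a real serving table (BigQuery, another topic, etc.).
t_env.execute_sql("""
    CREATE TABLE event_counts (
        window_start TIMESTAMP(3),
        event_type STRING,
        cnt BIGINT
    ) WITH ('connector' = 'print')
""")

# Event-derived metric: per-type counts over 1-minute tumbling windows.
t_env.execute_sql("""
    INSERT INTO event_counts
    SELECT window_start, event_type, COUNT(*) AS cnt
    FROM TABLE(TUMBLE(TABLE events, DESCRIPTOR(event_ts), INTERVAL '1' MINUTE))
    GROUP BY window_start, event_type
""").wait()
```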
Benefits
- Competitive salary and stock options
- Flexible vacation policy and a culture that values time for rest and recharging
- Remote-first work environment with unique virtual and in-person events to foster team connection
- Comprehensive health, dental, and vision insurance—we're a healthcare company that prioritizes your health
- 16 weeks of parental leave for all parents
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
SQL, dbt, Flink, FlinkSQL, BigQuery, Python, GCP, Kafka, Dagster, CI/CD
Soft Skills
collaboration, data governance, reliability standards, observability practices, performance tuning, data quality, cost optimization, scaling, documentation, operational ownership