Plume

Senior Data Engineer, Data and Applied AI

full-time

Location Type: Remote

Location: United States

Salary

💰 $158,000 - $168,000 per year

About the role

  • Building and maintaining production-grade data pipelines in cloud data warehouses such as Google BigQuery or equivalent, following architectural standards set by the Director of Data and AI.
  • Designing and developing dbt models across bronze, silver, and gold layers, including a focus on quality and governance via automated tests, documentation, and incremental load strategies.
  • Creating and optimizing Airflow DAGs for data workflow orchestration, including scheduling, dependency management, error handling, and alerting.
  • Implementing dimensional data models and data mart structures, guided by the team's modeling standards, that support clinical BI and ML feature consumption.
  • Building clear visualizations and dashboards in Looker or equivalent BI tools that follow common business analytics standards, in close collaboration with product analytics, finance, operations, growth, and clinical stakeholders.
  • Integrating healthcare data from sources such as EHRs, Stripe, 3rd-party APIs, and application database feeds, normalizing incoming data into the unified data platform.
  • Applying HIPAA-compliant data handling practices, including PHI/PII masking, tokenization, audit logging, and role-based access controls across all pipeline and AI system work.
  • Architecting and implementing RAG pipelines, including document ingestion, chunking, embedding generation, and retrieval, using frameworks such as LangChain or LangGraph.
  • Supporting MLOps workflows, including model training pipeline maintenance, deployment support, performance monitoring, and retraining triggers.
  • Reviewing teammates' pull requests, providing constructive technical feedback, and upholding the team's engineering standards.
  • Collaborating closely with product managers to understand requirements and deliver reliable data and AI products.
  • Monitoring and triaging assigned pipeline and data quality failures, escalating architectural issues as appropriate.
  • Documenting pipeline designs, data models, and technical decisions in alignment with the team's governance and lineage tracking standards.
  • Evaluating new tools and frameworks, providing hands-on prototyping and technical assessments.

Requirements

  • 5+ years of hands-on experience in data engineering, analytics engineering, or a closely related role.
  • 2+ years of experience working within the healthcare industry, including working knowledge of healthcare data standards, clinical workflows, regulated data environments, and domain-specific data visualizations.
  • Working knowledge of HIPAA — including PHI/PII classification, data masking, audit logging, and access control requirements.
  • Proven production experience with at least one major cloud data warehouse: BigQuery, Snowflake, or Redshift — including advanced SQL and query optimization.
  • Strong hands-on experience with dbt (Core or Cloud), including incremental models, tests, documentation, and multi-environment workflows.
  • Deep experience with Apache Airflow for workflow orchestration, including DAG design, scheduling, monitoring, and failure handling.
  • Demonstrated knowledge of dimensional data modeling — star/snowflake schemas, SCD Types 1/2, fact and dimension table design.
  • Hands-on experience delivering dashboards and reports in at least one enterprise BI tool: Looker, Power BI, Tableau, Qlik, etc.
  • Proficiency in Python for data pipeline development, API integrations, and automation (Pandas, PySpark, or similar).
  • Practical exposure to RAG pipeline development and LLM integration using LangChain, LangGraph, or LlamaIndex.
  • Hands-on exposure to MLOps concepts: model deployment, monitoring, and retraining workflows.
  • Knowledge of CI/CD tooling for data and AI workloads (GitHub Actions, dbt Cloud CI).
  • Strong understanding of data quality and governance principles (lineage, access controls, data contracts, and automated testing), plus experience with data governance tools such as OpenMetadata.
  • Excellent written and verbal communication skills, with the ability to collaborate effectively across engineering, analytics, and clinical teams.
  • Ability to work independently on assigned workstreams while keeping the Director and team informed of progress, blockers, and risks.

Benefits

  • Ground-Floor Equity (Series B)
  • Free medical, dental, and vision coverage starting on the first of the month after you begin full-time work
  • Unlimited PTO
  • 11 paid holidays and a one-week company shutdown in December
  • 401(k)
  • Free Plume and BetterHelp Subscriptions

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
data engineering, analytics engineering, cloud data warehouse, SQL, dbt, Apache Airflow, dimensional data modeling, Python, MLOps, RAG pipeline development
Soft Skills
communication, collaboration, independence, technical feedback, problem-solving, documentation, monitoring, triaging, evaluating new tools, stakeholder engagement