Notion

Software Engineer, Enterprise Data Platform

Full-time

Location Type: Hybrid

Location: San Francisco, California, United States

Salary

$230,000 - $300,000 per year

About the role

  • Design and evolve the data lakehouse: build and operate core lakehouse components (e.g., Iceberg/Hudi/Delta tables, catalogs, schema management) that serve as the source of truth for analytics, AI, and search.
  • Own critical data pipelines and services: design, implement, and harden batch and streaming pipelines (Spark, Kafka, EMR, etc.) that move and transform data reliably across regions and cells.
  • Advance EKM and encryption-by-design: work with Security and platform teams to integrate Enterprise Key Management (EKM) into data workflows, including file- and record-level encryption and safe key handling in Spark and storage systems.
  • Improve data access, auditability, and residency: build primitives for fine-grained access control, auditing, and data residency so customers can see who accessed what, where, and under which guarantees.
  • Drive reliability and observability: raise the operational bar for our data stack by improving the on-call experience, debugging, and alerting for data jobs and services.
  • Optimize large-scale performance and cost: tackle performance and cost challenges across Kafka, Spark, and storage for very large workspaces (20k+ users, multi-cell deployments), including cluster migrations and workload tuning.
  • Enable ML and search workflows: build infrastructure to support training and inference pipelines, ranking workflows, and embedding infrastructure on top of the shared data platform.
  • Shape the platform roadmap: contribute to design docs and evaluations that influence our long-term platform direction and vendor choices.

Requirements

  • 5+ years building and operating data platforms or large-scale data infrastructure for SaaS or similar environments.
  • Strong skills in at least one of Python, Java, or Scala; comfortable working with SQL for analytics and data modeling.
  • Hands-on experience with Spark or similar distributed processing systems, including debugging and performance tuning.
  • Experience with Kafka or equivalent streaming systems; familiarity with CDC/ingestion patterns (e.g., Debezium, Fivetran, custom connectors).
  • Experience with data lakes and table formats (Iceberg, Hudi, or Delta) and/or data catalogs and schema evolution.
  • Practical understanding of access control, encryption at rest/in transit, and auditing as they apply to data platforms.
  • Experience with at least one major cloud provider (AWS, GCP, or Azure) and managed data/compute services (e.g., EMR, Dataproc, Kubernetes-based compute).
  • Comfortable owning services and pipelines in production, including on-call, incident response, and reliability improvements.

Benefits

  • Health insurance
  • 401(k) matching
  • Flexible work arrangements
  • Professional development opportunities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, Java, Scala, SQL, Spark, Kafka, EMR, Iceberg, Hudi, Delta
Soft Skills
reliability improvements, incident response, debugging, performance tuning, on-call experience