GitLab

Distinguished Data Systems Architect, Data Engineering

full-time

Location Type: Remote

Location: Remote • 🇺🇸 United States

Salary

💰 $219,100 - $328,700 per year

Job Level

Mid-Level, Senior

Tech Stack

Airflow, Cloud, Docker, Kubernetes, Open Source, Postgres, Python

About the role

  • Drive architectural vision for scalable, distributed data systems across SaaS and self-managed deployments, designing database solutions that balance OLTP/OLAP performance, scalability, and cost-efficiency
  • Establish enterprise data governance frameworks including lineage, quality controls, versioning, and compliance practices that meet regulatory requirements across global markets
  • Architect monetizable data services and APIs with semantic models serving internal analytics and external product offerings, enabling new revenue streams while maintaining security and performance SLAs
  • Create a cohesive architectural blueprint of GitLab's data ecosystem, identifying gaps against modern platforms and establishing opinionated design principles grounded in proven cloud-native patterns
  • Design event-driven architectures and end-to-end data lifecycle systems spanning ingestion, orchestration (Argo, Airflow, Kubernetes), transformation workflows, and unified metadata management with comprehensive observability
  • Partner with product and engineering leadership to embed AI-driven patterns into data infrastructure and align senior engineering leaders on common design tenets and platform standards
  • Transform ambiguous business challenges into strategic technical roadmaps, leading high-stakes architectural engagements where data platforms create measurable competitive differentiation

Requirements

  • Experience architecting large-scale distributed data systems in complex, regulated domains with unified platforms integrating cloud-native compute, orchestration, and semantic modeling
  • Demonstrated leadership building multi-modal data services with strong developer experience principles, focusing on monetization, governance, and data product lifecycle management
  • Hands-on expertise with modern data stack technologies including Python, Docker, Airflow, Trino, Postgres, distributed query engines, and graph-based metadata systems
  • Advanced knowledge bridging cloud and on-premises deployments with automation, developer self-service focus, and data integration through connector marketplaces
  • Deep understanding of data processing paradigms and standards, including synchronous vs. asynchronous processing, schema management, logical data modeling, and open standards such as OpenTelemetry, OpenMetadata, and OpenLineage
  • Experience with AI-driven architectures and emerging technologies including model orchestration, agentic patterns, and standards like MCP (Model Context Protocol)
  • Strong architectural opinions on cost-aware, resilient solutions that optimize decisions across the entire data lifecycle, with a focus on scalability and performance trade-offs
  • Passion for open source platforms, team mentorship, and collaborative values, with the ability to build scalable solutions that align with organizational culture and technical excellence
  • Ability to design and implement a Model Driven Architecture (MDA) framework that establishes a clear separation between logical/conceptual data models and platform-specific physical implementations, enabling agility and reducing technical debt across enterprise data systems

Benefits

  • Benefits to support your health, finances, and well-being
  • Flexible Paid Time Off
  • Team Member Resource Groups
  • Equity Compensation & Employee Stock Purchase Plan
  • Growth and Development Fund
  • Parental leave
  • Home office support

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
architecting large-scale distributed data systems, data governance frameworks, event-driven architectures, data lifecycle systems, semantic modeling, data product lifecycle management, data processing paradigms, Model Driven Architecture (MDA), cost-aware solutions, data integration
Soft skills
leadership, team mentorship, collaborative values, strategic technical roadmaps, transforming business challenges, building scalable solutions, aligning engineering leaders, communicating architectural vision, embedding AI-driven patterns, focusing on developer experience