Upbound

Data Engineer – AI

full-time

Location Type: Remote

Location: Remote • Anywhere in North America

Job Level

Senior, Lead

Tech Stack

Airflow, Cloud, Elasticsearch, Kubernetes, Spark

About the role

  • Define and drive the technical vision for data platforms that support AI-powered features in Crossplane and Upbound Spaces
  • Lead the design of data pipelines that transform infrastructure data into training datasets for ML models
  • Architect vector search and RAG systems that leverage Crossplane Control Planes & Upbound Marketplace as a knowledge store
  • Build data infrastructure that processes resources, extensions, and compositions for semantic search
  • Establish frameworks for collecting, processing, and analyzing infrastructure configuration data
  • Design data pipelines that handle Crossplane-specific data
  • Create infrastructure for indexing and searching Upbound Marketplace content, documentation, and community patterns
  • Develop metrics and monitoring for AI features integrated with Upbound's control plane architecture
  • Design data systems that power AI agents for infrastructure provisioning & operations, helping users generate and optimize Crossplane compositions
  • Create feature engineering platforms that extract signals from control plane operations, resource status, and reconciliation patterns
  • Implement data infrastructure for training models that predict infrastructure failures, optimize resource allocation, and suggest configuration improvements
  • Drive the development of knowledge graph representations of infrastructure dependencies and relationships

Requirements

  • 10+ years of software/data engineering experience with at least 4 years in technical leadership roles
  • Proven track record building data platforms that support production systems at scale
  • Deep expertise in both traditional data engineering (Spark, Airflow, data lakes) and ML-specific infrastructure (feature stores, model serving)
  • Experience with vector databases and vector-capable search engines (Pinecone, Weaviate, Qdrant, Milvus, pgvector, OpenSearch, Elasticsearch)
  • Demonstrated experience with LLM applications, including RAG architectures and semantic search implementations
  • Understanding of Kubernetes, cloud-native architectures, and infrastructure-as-code principles

Benefits

  • Health insurance
  • Retirement plans
  • Paid time off
  • Flexible work arrangements
  • Professional development

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
data engineering, machine learning, data pipelines, feature engineering, infrastructure provisioning, semantic search, vector search, monitoring, data processing, knowledge graph
Soft skills
technical leadership, communication, problem-solving, collaboration, strategic vision