
Data Engineer – AI
Upbound
full-time
Location Type: Remote
Location: Remote • 🏈 Anywhere in North America
Job Level
Senior, Lead
Tech Stack
Airflow, Cloud, Elasticsearch, Kubernetes, Spark
About the role
- Define and drive the technical vision for data platforms that support AI-powered features in Crossplane and Upbound Spaces
- Lead the design of data pipelines that transform infrastructure and data into training datasets for ML models
- Architect vector search and RAG systems that leverage Crossplane Control Planes & Upbound Marketplace as a knowledge store
- Build data infrastructure that processes resources, extensions, and compositions for semantic search
- Establish frameworks for collecting, processing, and analyzing infrastructure configuration data
- Design data pipelines that handle Crossplane-specific data
- Create infrastructure for indexing and searching Upbound Marketplace content, documentation, and community patterns
- Develop metrics and monitoring for AI features integrated with Upbound's control plane architecture
- Design data systems that power AI agents for infrastructure provisioning & operations, helping users generate and optimize Crossplane compositions
- Create feature engineering platforms that extract signals from control plane operations, resource status, and reconciliation patterns
- Implement data infrastructure for training models that predict infrastructure failures, optimize resource allocation, and suggest configuration improvements
- Drive the development of knowledge graph representations of infrastructure dependencies and relationships
Requirements
- 10+ years of software/data engineering experience with at least 4 years in technical leadership roles
- Proven track record building data platforms that support production systems at scale
- Deep expertise in both traditional data engineering (Spark, Airflow, data lakes) and ML-specific infrastructure (feature stores, model serving)
- Experience with vector databases and vector-capable search engines (Pinecone, Weaviate, Qdrant, Milvus, pgvector, OpenSearch, Elasticsearch)
- Demonstrated experience with LLM applications, including RAG architectures and semantic search implementations
- Understanding of Kubernetes, cloud-native architectures, and infrastructure-as-code principles
Benefits
- Health insurance
- Retirement plans
- Paid time off
- Flexible work arrangements
- Professional development
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data engineering, machine learning, data pipelines, feature engineering, infrastructure provisioning, semantic search, vector search, monitoring, data processing, knowledge graph
Soft skills
technical leadership, communication, problem-solving, collaboration, strategic vision