Principal GenAI Data Engineer
Zscaler
Architect enterprise-scale data platforms for AI at Zscaler, and lead the design and implementation of scalable data pipelines for GenAI applications.
Tech Stack
Tools & technologies
- Python
About the role
Key responsibilities & impact
- Architect enterprise-scale GenAI data platforms for ingestion, transformation, enrichment, and serving of structured and unstructured data
- Design scalable pipelines for enterprise knowledge ingestion from diverse data sources including documents, SaaS platforms, knowledge bases, collaboration tools, and databases
- Define architecture for metadata extraction, chunking, enrichment, embeddings generation, and knowledge preparation workflows
- Design AI-ready data models and storage strategies for vector, graph, and hybrid knowledge systems
- Architect scalable unstructured data processing pipelines for text, images, PDFs, tables, and multimodal content
Requirements
What you’ll need
- Expert-level Python programming and software engineering capabilities
- Experience building distributed/scalable data pipelines for AI workloads
- Strong understanding of unstructured data extraction and processing pipelines
- Experience with vector databases, graph databases, and metadata/knowledge storage systems
- Hands-on experience with clustering, entity recognition algorithms, and modern retrieval strategies (including RAG, search, and agentic AI workflows)
Benefits
Comp & perks
- Various health plans
- Time off plans for vacation and sick time
- Parental leave options
- Retirement options
- Education reimbursement
- In-office perks, and more!
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, data pipelines, unstructured data processing, metadata extraction, clustering algorithms, entity recognition, vector databases, graph databases, AI workloads, retrieval strategies