Data Engineer

• Building and maintaining backend data ingestion and embedding pipelines
• Setting up environments, cloning repositories, and running pipelines in JupyterHub
• Working on large-scale ETL processes, including converting Iceberg tables to Parquet and exporting data to S3 buckets
• Designing and optimizing schemas for Neo4j-based graph solutions
• Integrating knowledge workflows and KB articles into graph structures for advanced retrieval
• Troubleshooting data quality issues and optimizing Spark jobs for efficiency
• Implementing retry mechanisms and debugging full-stack issues related to large file operations
• Managing secure access using JWT and Kerberos authentication
• Handling credentials for Oracle DB and API clients via HashiCorp Vault
• Working with GitLab for source control and Jira for project tracking
• Supporting the migration from Azure DevOps to GitLab and Jira

Senior Data Engineer – AI Solutions

Job Level

Tech Stack

About the role

Requirements

Applicant Tracking System Keywords

Hard skills