Implement AI data pipelines that bring together structured, semi-structured, and unstructured data to support AI and agentic solutions
Pre-process data with extraction, chunking, embedding, and grounding strategies to prepare it for AI consumption
Develop AI-driven systems to improve data capabilities and ensure compliance with industry best practices
Implement efficient Retrieval-Augmented Generation (RAG) architectures and integrate them with enterprise data infrastructure
Collaborate with cross-functional teams to integrate solutions into operational processes and systems
Design, build, and maintain scalable and robust real-time data streaming pipelines using Apache Kafka, AWS Kinesis, Spark Streaming, or similar tools (an illustrative sketch follows this list)
Develop data domains and data products for various consumption archetypes, including reporting, data science, AI/ML, and analytics
Ensure reliability, availability, and scalability of data pipelines through effective monitoring, alerting, and incident management
Implement best practices in reliability engineering, including redundancy, fault tolerance, and disaster recovery
Collaborate closely with DevOps and infrastructure teams to ensure seamless deployment and maintenance of data systems
Mentor junior team members and promote best practices, standards, and reusable patterns
Develop graph database solutions for complex data relationships supporting AI systems
Apply AI solutions to insurance-specific data use cases and partner with architects and stakeholders to implement the vision while safeguarding integrity and scalability
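For flavor only, here is a minimal sketch of the kind of real-time streaming pipeline described above, using Spark Structured Streaming over Kafka. The broker address, topic name, event schema, and S3 paths are placeholder assumptions for the example, not details prescribed by this role.

```python
# Minimal Kafka -> Spark Structured Streaming sketch; names and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StringType, TimestampType

spark = SparkSession.builder.appName("policy-events-stream").getOrCreate()

# Read raw events from a hypothetical Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # placeholder broker
    .option("subscribe", "policy-events")                # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Parse the JSON payload into typed columns.
event_schema = (
    StructType()
    .add("policy_id", StringType())
    .add("event_type", StringType())
    .add("event_ts", TimestampType())
)
events = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Land curated events as Parquet; checkpointing provides recovery on restart.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://example-bucket/curated/policy_events/")               # placeholder path
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/policy_events/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```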
Requirements
Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field
8+ years of data engineering experience, including data solutions, SQL and NoSQL databases, Snowflake, ETL/ELT tools, CI/CD, big data, cloud technologies (AWS, Google Cloud, or Azure), Python/Spark, and data mesh, data lake, or data fabric architectures
3+ years of AI/ML experience, with 1+ years of data engineering experience focused on supporting Generative AI technologies
Strong hands-on experience implementing production-ready, enterprise-grade AI data solutions
Experience with prompt engineering techniques for large language models
Experience implementing Retrieval-Augmented Generation (RAG) pipelines that integrate retrieval mechanisms with language models (see the sketch after this list)
Experience with vector databases and graph databases, including implementation and optimization
Experience in processing and leveraging unstructured data for AI applications
Proficiency in implementing scalable, AI-driven data systems supporting agentic solutions (AWS Lambda, S3, EC2, LangChain, LangGraph)
Strong programming skills in Python and familiarity with deep learning frameworks such as PyTorch or TensorFlow
Experience with cloud platforms (AWS, GCP, or Azure) and containerization technologies (Docker, Kubernetes)
Experience implementing data governance practices, including data quality, lineage, and data catalogue capture
Strong written and verbal communication skills
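As an illustration of the RAG pre-processing and retrieval work referenced above, here is a minimal sketch of chunking, embedding, and similarity-based retrieval for grounding a prompt. The embedding model, chunk sizes, and sample documents are assumptions for the example, not a required stack.

```python
# Minimal RAG pre-processing (chunking + embedding) and retrieval sketch.
# Model name, chunk parameters, and documents are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split a document into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

documents = [
    "Policy endorsements modify coverage terms after issuance...",
    "A claim reserve is the insurer's estimate of unpaid losses...",
]  # placeholder unstructured content

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model
chunks = [c for doc in documents for c in chunk(doc)]
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the top-k chunks by cosine similarity to ground an LLM prompt."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ q  # cosine similarity, since vectors are normalized
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

# Ground a downstream language-model call with the retrieved context.
context = "\n---\n".join(retrieve("How are claim reserves set?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: How are claim reserves set?"
```

In production this in-memory index would typically be replaced by a vector database, with the same chunk-embed-retrieve-ground flow.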
Benefits
Other rewards may include short-term or annual bonuses
Long-term incentives
On-the-spot recognition
Hybrid work schedule with expectation of working in an office location 3 days a week (Tuesday through Thursday)