Salary
💰 $135,040 - $202,560 per year
Tech Stack
Apache, Assembly, AWS, Azure, Cloud, Docker, DynamoDB, EC2, ETL, Google Cloud Platform, Kafka, Kubernetes, Neo4j, NoSQL, Python, PyTorch, Spark, SQL, TensorFlow
About the role
- Implement AI data pipelines that bring together structured, semi-structured and unstructured data to support AI and agentic solutions
- Pre-process data using extraction, chunking, embedding and grounding strategies to prepare it for those solutions
- Develop AI-driven systems to improve data capabilities, ensuring compliance with industry best practices
- Implement efficient Retrieval-Augmented Generation (RAG) architectures and integrate with enterprise data infrastructure
- Collaborate with cross-functional teams to integrate solutions into operational processes and systems supporting various functions
- Design, build and maintain scalable and robust real-time data streaming pipelines using technologies such as Apache Kafka, AWS Kinesis, Spark Streaming, or similar
- Develop data domains and data products for various consumption archetypes including Reporting, Data Science, AI/ML, Analytics
- Ensure the reliability, availability, and scalability of data pipelines and systems through effective monitoring, alerting, and incident management
- Implement best practices in reliability engineering, including redundancy, fault tolerance, and disaster recovery strategies
- Collaborate closely with DevOps and infrastructure teams to ensure seamless deployment, operation, and maintenance of data systems
- Mentor junior team members and engage in communities of practice to deliver high-quality data and AI solutions while promoting best practices, standards, and adoption of reusable patterns
- Develop graph database solutions for complex data relationships supporting AI systems
- Apply AI solutions to insurance-specific data use cases and challenges
- Partner with architects and stakeholders to influence and implement the vision of the AI and data pipelines while safeguarding the integrity and scalability of the environment
Requirements
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field
- 8+ years of data engineering experience, including data solutions, SQL and NoSQL, Snowflake, ETL/ELT tools, CI/CD, big data, cloud technologies (AWS/Google/Azure), Python/Spark, and data mesh, data lake or data fabric architectures
- 3+ years of AI/ML experience, with 1+ years of data engineering experience focused on supporting Generative AI technologies
- Strong hands-on experience implementing production-ready, enterprise-grade AI data solutions
- Experience with prompt engineering techniques for large language models
- Experience in implementing Retrieval-Augmented Generation (RAG) pipelines, integrating retrieval mechanisms with language models
- Experience with vector databases and graph databases, including implementation and optimization
- Experience in processing and leveraging unstructured data for AI applications
- Proficiency in implementing scalable AI-driven data systems supporting agentic solutions (AWS Lambda, S3, EC2, LangChain, LangGraph)
- Strong programming skills in Python and familiarity with deep learning frameworks such as PyTorch or TensorFlow
- Experience building AI pipelines that bring together structured, semi-structured and unstructured data, including semantic modeling and pre-processing with extraction, chunking, embedding and grounding strategies
- Experience with vector databases, graph databases, NoSQL and document databases (e.g., AWS OpenSearch, GCP Vertex AI, Neo4j, Spanner Graph, Neptune, MongoDB, DynamoDB)
- Experience implementing data governance practices, including data quality, lineage and data catalogue capture, on a large-scale data platform
- Experience with cloud platforms (AWS, GCP, or Azure) and containerization technologies (Docker, Kubernetes)
- Strong written and verbal communication skills and ability to explain technical concepts to various stakeholders
- Experience with multi-cloud and hybrid AI solutions
- AI Certifications (preferred)
- Experience in the P&C or Employee Benefits industry (preferred)
- Knowledge of natural language processing (NLP) and computer vision technologies
- Contributions to open-source AI projects or research publications in the field of Generative AI (preferred)
- Experience mentoring and developing junior AI or data engineers
- Ability to lead in a lean, agile, and fast-paced organization that applies Scaled Agile principles
- Candidate must be authorized to work in the US without company sponsorship
- The company will not support the STEM OPT I-983 Training Plan endorsement for this position