iBusiness Funding

AI Knowledge Data Engineer

Full-time

Location Type: Remote

Location: Remote • Florida • 🇺🇸 United States

Salary

💰 $180,000 - $240,000 per year

Job Level

Mid-Level, Senior

Tech Stack

Airflow, Elasticsearch, Python, PyTorch, Spark, TensorFlow

About the role

  • Architect, implement, and optimize retrieval-augmented generation (RAG) workflows by integrating local LLMs (e.g., Llama) with retrieval mechanisms (vector search, Elasticsearch, FAISS, Weaviate)
  • Design, build, and maintain scalable data pipelines for ingesting, transforming, indexing, and retrieving structured and unstructured data from diverse sources
  • Design, build, and scale addressable services and tool specifications that LLMs and agents can use to orchestrate workflows
  • Orchestrate and scale training data operations, including data curation, versioning, and lineage tracking for large-scale LLM training and fine-tuning
  • Develop and maintain ontologies, knowledge graphs, and semantic data models to structure and integrate domain-specific knowledge for improved retrieval and reasoning
  • Implement and optimize knowledge retrieval strategies (dense/sparse retrieval, ranking algorithms) to maximize system accuracy and relevance
  • Aggregate disparate knowledge bases and heterogeneous data into a unified layer that provides access to relevant contextual information
  • Design cognitive memory systems for AI agents, enabling persistent knowledge retention and contextual awareness across interactions
  • Collaborate with AI researchers, data scientists, and engineers to align knowledge architecture with business objectives and ensure data quality
  • Evaluate and integrate new technologies and research advancements in LLMs, RAG, information retrieval, and knowledge representation
  • Maintain clear and comprehensive documentation of models, pipelines, and workflows

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field
  • Proven experience designing and scaling data pipelines and training data workflows for LLMs or similar AI systems
  • Strong background in information retrieval systems, vector search technologies, and RAG frameworks (e.g., FAISS, Pinecone, Elasticsearch, Milvus)
  • Proficiency in programming (Python) and machine learning libraries (TensorFlow, PyTorch)
  • Experience with ontologies, knowledge graphs, and semantic technologies (RDF, OWL, SPARQL)
  • Familiarity with distributed data processing and orchestration tools (e.g., Spark, Airflow, Kubeflow)
  • Excellent analytical, problem-solving, and communication skills
  • Ability to work collaboratively in a cross-functional, fast-paced environment

Benefits

  • Medical, dental, and vision coverage
  • 401(k) with company match
  • Paid time off

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
retrieval-augmented generation, data pipelines, information retrieval, vector search, knowledge graphs, ontologies, machine learning, programming, Python, semantic technologies
Soft skills
analytical skills, problem-solving, communication skills, collaboration, adaptability