LLM::RetrievalAugmentedGeneration

Raku package for doing LLM Retrieval Augmented Generation (RAG).


Motivation and general procedure

Assume we have a large (or largish) collection of (Markdown) documents and we want to interact with it as if a certain LLM has been specially trained on that collection.

Here is one way to achieve this:

  1. The "data wrangling problem": the conversion of a collection of documents into Markdown files, and then the partitioning of those files into text chunks.
    • There are several packages and functions that can do the conversion.
    • It is not trivial to partition texts into reasonable text chunks.
      • Certain text paragraphs might be too big for certain LLMs to make embeddings for.
  2. Each of the text chunks is "vectorized" via an LLM embedding model.
  3. The vectors are put into a vector database, or "just" into a nearest-neighbors finder function object (see Math::Nearest, [AAp6]).
  4. When a user query is given:
    • Its LLM embedding vector is computed.
    • The closest text chunk vectors are found.
  5. The corresponding closest text chunks are given to the LLM to formulate a response to the user's query. (A code sketch of these steps is given after this list.)
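Here is a minimal Raku sketch of steps 2-5, using LLM::Functions [AAp3]. It assumes an OpenAI API key is set up; the exact calls and return shapes of llm-embedding should be checked against the LLM::Functions documentation, and the plain cosine-distance scan is just a stand-in for a vector database or a Math::Nearest [AAp6] finder object:

use LLM::Functions;

# Step 1 (assumed already done): @chunks holds the text chunks.
my @chunks = ['...first chunk...', '...second chunk...', '...third chunk...'];

# Step 2: vectorize each chunk with an LLM embedding model.
my $conf = llm-configuration('ChatGPT');
my @vectors = llm-embedding(@chunks, e => $conf);

# A simple cosine distance; Math::DistanceFunctions::Native [AAp7] has fast native versions.
sub cosine-distance(@u, @v) {
    1 - sum(@u Z* @v) / sqrt(sum(@u Z* @u) * sum(@v Z* @v))
}

# Step 4: compute the embedding vector of the user query
# (assuming a list of vectors is returned) ...
my $query = 'How is the document collection ingested?';
my @query-vec = |llm-embedding($query, e => $conf).head;

# ... and find the indexes of the closest text chunks.
my @top = @vectors.pairs.sort({ cosine-distance(.value, @query-vec) }).head(3);

# Step 5: give the closest chunks to the LLM to formulate a response.
my $answer = llm-synthesize([
    'Answer the query using only the context below.',
    'Context:', @chunks[@top».key].join("\n\n"),
    'Query: ' ~ $query
], e => $conf);

say $answer;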

Workflow

The Retrieval Augmented Generation (RAG) workflow we consider follows the general procedure above; the component diagram in the next subsection shows it in detail.

Component diagram

Here is a Mermaid-JS component diagram that shows the components of the Retrieval Augmented Generation (RAG) workflow:

flowchart TD
    subgraph LocalVDB[Local Folder]
        A(Vector Database 1)
        B(Vector Database 2)
        C(Vector Database N)
    end
    ID[Ingest document collection]
    SD[Split Documents]
    EV[Get LLM Embedding Vectors]
    CD[Create Vector Database]
    ID --> SD --> EV --> CD

    CD -.- CArray[[CArray<br>representation]]

    CD -.-> |export| LocalVDB

    subgraph Creation
        ID
        SD
        EV
        CD
    end

    LocalVDB -.- JSON[[JSON<br>representation]]

    LocalVDB -.-> |import|D[Ingest Vector Database]
 
    D -.- CArray
    F -.- |nearest neighbors<br>distance function|CArray
    D --> E
    E[/User Query/] --> F[Retrieval]
    F --> G[Document Selection]
    G -->|Top K documents| H(Model Fine-tuning)
    H --> I[[Generation]]
    I <-.-> LLM{{LLM}}
    I -->J[/Output Answer/]
    G -->|Top K passages| K(Model Fine-tuning)
    K --> I

    subgraph RAG[Retrieval Augmented Generation]
        D 
        E
        F
        G
        H
        I
        J
        K
    end

In this diagram:

  • The "Creation" subgraph shows the preparation phase: the document collection is ingested, the documents are split into chunks, LLM embedding vectors are computed for the chunks, and a vector database is created.
  • Created vector databases are exported to (and imported from) a local folder using a JSON representation.
  • In memory, the embedding vectors have a CArray representation, which is used by the nearest-neighbors distance functions (see Math::DistanceFunctions::Native, [AAp7]).
  • The "Retrieval Augmented Generation" subgraph shows the query phase: a user query drives the retrieval and document selection, and the top K documents or passages are passed on to the generation step, which interacts with the LLM to produce the output answer.
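Here is a minimal sketch of the export and import components of the diagram, continuing the example above. It serializes the chunks and vectors with JSON::Fast and rebuilds a CArray representation with NativeCall; the package's own vector database export and import store additional metadata:

use JSON::Fast;
use NativeCall;

# Export: pair the chunk texts with their embedding vectors
# and write them to a local folder as JSON.
my %db = name => 'my-vdb', texts => @chunks, vectors => @vectors;
spurt 'my-vdb.json', to-json(%db);

# Import: ingest the vector database back from the JSON file.
my %db2 = from-json(slurp 'my-vdb.json');

# Convert the vectors into CArray[num64] objects, suitable for
# native nearest-neighbors distance functions.
my @cvectors = %db2<vectors>.map({ CArray[num64].new(.List».Num) });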


Implementation notes


References

Packages

[AAp1] Anton Antonov, WWW::OpenAI Raku package, (2023), GitHub/antononcube.

[AAp2] Anton Antonov, WWW::PaLM Raku package, (2023), GitHub/antononcube.

[AAp3] Anton Antonov, LLM::Functions Raku package, (2023-2024), GitHub/antononcube.

[AAp4] Anton Antonov, LLM::Prompts Raku package, (2023-2024), GitHub/antononcube.

[AAp5] Anton Antonov, ML::FindTextualAnswer Raku package, (2023-2024), GitHub/antononcube.

[AAp6] Anton Antonov, Math::Nearest Raku package, (2024), GitHub/antononcube.

[AAp7] Anton Antonov, Math::DistanceFunctions::Native Raku package, (2024), GitHub/antononcube.

[AAp8] Anton Antonov, ML::StreamsBlendingRecommender Raku package, (2021-2023), GitHub/antononcube.