Knowledge Base

Knowledge bases allow you to build LLM workflows with your own data. There are two ways to work with knowledge bases in VectorShift - either through the Knowledge Base Reader node or through the Semantic Search node.

A Knowledge Base Reader node queries vectors that have already been stored permanently in the database. A Semantic Search node loads documents into a temporary semantic database and allows querying to find the most similar documents.

  • Use a Knowledge Base Reader node when you want to query data that has already been loaded into an existing knowledge base.

  • Use a Semantic Search node when your pipeline loads new data at run time for querying.

Knowledge bases allow you to return the most relevant pieces of information to a model. This is important for a few reasons:

  1. LLMs have limited context windows, meaning there is a maximum amount of data they can take in at once. If you are looking to query over large amounts of data (e.g., your Google Drive, OneDrive, or Notion pages), you won't be able to do so without creating a knowledge base (see the rough arithmetic after this list).

  2. Even if all of the data fits within the context window, multiple articles have shown that passing in large amounts of data can increase hallucinations and cause the model to overlook the most relevant information when answering the question.
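As a rough illustration of the first point, a back-of-the-envelope calculation using the common approximation of roughly 4 characters per token shows how quickly a corpus outgrows a context window. The corpus size and context window below are assumed example numbers, not VectorShift defaults.

```python
# Rough illustration: estimating whether a corpus fits in a context window.
# The 4-characters-per-token rule of thumb and the example numbers are assumptions.

def estimate_tokens(num_chars: int) -> int:
    """Approximate token count using the ~4 characters per token rule of thumb."""
    return num_chars // 4

corpus_chars = 50 * 20_000        # e.g., 50 documents of ~20,000 characters each (assumed)
context_window_tokens = 128_000   # example context window size (assumed)

corpus_tokens = estimate_tokens(corpus_chars)
print(f"Corpus: ~{corpus_tokens:,} tokens vs. a {context_window_tokens:,}-token context window")
# A knowledge base sidesteps this by returning only the most relevant chunks per query.
```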

From a technical perspective, knowledge bases are referred to as "Vector Databases". See this blog post for a detailed explanation of vector databases.

Knowledge Base Reader

The Knowledge Base Reader has one input edge, "query", and one output edge, "result". The node links to an existing knowledge base that you have defined (use the pull-down menu within the node to find existing knowledge bases). You can create a new knowledge base by clicking "Create a New Knowledge Base".

How to use the Knowledge Base Reader?

Within knowledge bases, you can store data of various types: static files, URLs, recursive URLs (to scrape all the subpages of a site), data from integrations (e.g., Google Drive, OneDrive, Notion), transcripts of YouTube videos, and more. For URLs, you will have the option to re-scrape the URLs at selected time intervals. For integrations, we automatically live-sync the data so that your knowledge base is always up to date.

A common structure for using the Knowledge Base Reader is below. A question is used to query the Knowledge Base Reader, which returns the most relevant pieces of information from the entire corpus of information. Then, the relevant pieces of information are passed to an LLM to use as Context when answering the question.

Another common structure is having an LLM generate a search query and connecting the output of the LLM to the Query input edge of the Knowledge Base Reader. In certain cases, this can increase the relevance of the data coming back from the knowledge base.
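The sketch below illustrates the first structure (question → Knowledge Base Reader → LLM) in code. The retriever and LLM functions are toy stand-ins for the real pipeline nodes, used only to show how the pieces connect; they are not VectorShift APIs.

```python
# Minimal sketch of the question -> Knowledge Base Reader -> LLM structure.
# The retriever and LLM below are toy stand-ins for the real pipeline nodes.

corpus = [
    "VectorShift pipelines are built by connecting nodes on a canvas.",
    "The Knowledge Base Reader node returns relevant chunks for a query.",
    "Chunk size defaults to 400 tokens and can be changed in Advanced Settings.",
]

def retrieve(query: str, max_chunks: int = 2) -> list[str]:
    """Toy retriever: rank chunks by word overlap with the query."""
    score = lambda text: len(set(query.lower().split()) & set(text.lower().split()))
    return sorted(corpus, key=score, reverse=True)[:max_chunks]

def call_llm(system: str, prompt: str) -> str:
    """Placeholder for the LLM node; a real pipeline calls a model here."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"

question = "What is the default chunk size?"
context = "\n\n".join(retrieve(question))
answer = call_llm(
    system="Answer the User Question using only the Context.",
    prompt=f"User Message:\n{question}\n\n___\n\nContext\n{context}",
)
print(answer)
```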

Example System and Prompt when working with the Knowledge Base Reader

To search over a knowledge base, you can employ an LLM to address users' inquiries. Below is an example System and Prompt for this use case.

System

You are the customer support assistant for VectorShift, a no-code platform to build LLM applications. You receive Context on VectorShift and you answer the User Question using the Context. You utilize a professional tone and you try to be as concise as possible.

If you are unable to answer the Question, please refer the user to one of the following resources:

1) Contact support email: Xx
2) Calendly link: Xx
3) Documentation: Xx
4) Tutorials: Xx

Prompt

User Message:
{{User_Message}}

___

Context
{{Context}}
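At run time, the double-brace variables in the Prompt are replaced with the outputs of the connected nodes. A minimal illustration of that substitution is below; the variable names match the template above, and the filled-in values are made up for the example.

```python
# Illustration of how {{User_Message}} and {{Context}} are filled in at run time.
# The example values are made up.
prompt_template = "User Message:\n{{User_Message}}\n\n___\n\nContext\n{{Context}}"

filled_prompt = (
    prompt_template
    .replace("{{User_Message}}", "How do I create a knowledge base?")
    .replace("{{Context}}", "Click 'Create a New Knowledge Base' on the Knowledge Base Reader node.")
)
print(filled_prompt)
```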

Creating a Knowledge Base

When you click "Create a Knowledge Base" on the Knowledge Base Reader node, it brings up a screen when you can name and give a description to a Knowledge Base.

Under "Advanced Settings", you can also choose the chunk size, chunk overlap, hybrid search (discussed later), and embedding model.

  1. Chunk size: when the data is loaded into the knowledge base, the database will chunk the data, or cut the data into pieces. Here, you control the size of each chunk. The chunk size defaults to 400 tokens, or roughly 1,600 characters (approximately 4 characters = 1 token). Decreasing the chunk size can sometimes be a remedy for returning more relevant information to an LLM (if a chunk is too large, the LLM can get confused by the large amount of data it has to reason over). See the simplified chunker after this list.

  2. Chunk overlap: the chunk overlap is the number of tokens that overlap between consecutive chunks. This defaults to 0 tokens. Increase the chunk overlap if you are concerned that chunking is cutting off important data (e.g., if a chunk boundary falls in the middle of a word).

  3. Embedding Model: this is the model that is used to embed the data into the knowledge base / vector database. This defaults to "text-embedding-3-small", which offers a strong combination of performance and speed.
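To make chunk size and chunk overlap concrete, here is a simplified character-based chunker using the same ~4 characters per token approximation mentioned above. The real knowledge base chunker may split on different boundaries; this is only meant to illustrate what the two settings control.

```python
# Simplified illustration of chunk size and chunk overlap.
# Uses the ~4 characters per token approximation; the actual chunker may differ.

CHARS_PER_TOKEN = 4

def chunk_text(text: str, chunk_size_tokens: int = 400, overlap_tokens: int = 0) -> list[str]:
    chunk_chars = chunk_size_tokens * CHARS_PER_TOKEN   # 400 tokens ~= 1,600 characters
    overlap_chars = overlap_tokens * CHARS_PER_TOKEN
    step = chunk_chars - overlap_chars                  # how far the window advances each time
    return [text[i:i + chunk_chars] for i in range(0, len(text), step)]

document = "VectorShift knowledge bases chunk your data before embedding it. " * 200
print(len(chunk_text(document)))                      # chunks with no overlap
print(len(chunk_text(document, overlap_tokens=50)))   # overlapping chunks share 200 characters
```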

Knowledge Base Query Options

You can change various query parameters to improve your search. You can do this by clicking on the gear on the node.

  • Max Chunks Per Query: This parameter controls how many chunks or documents are returned from the vector database. For example, you may increase the number of chunks if you are finding that the LLM is not doing an adequate job of answering the user's question based on the returned context. This defaults to two chunks.

  • Enable Filter: Enables filtering documents retrieved from the database based on document metadata. See the Metadata Filtering section below.

  • Rerank Documents: Performs an additional Reranking step to reorder the documents by relevance to the query.

    • Note: Reranking incurs a latency cost but can improve query results, especially when returning a large number of chunks (see the conceptual sketch after this list).

    • Reranking defaults to off.
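As a conceptual sketch of what the reranking step does: the initial retrieval returns candidate chunks, and a second scoring pass reorders them by relevance to the query. The scoring function below is a toy stand-in; real rerankers typically use a dedicated reranking model.

```python
# Conceptual sketch of reranking. relevance_score() is a toy stand-in for a real
# reranking model; it only illustrates the reordering step.

def relevance_score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query words that appear in the chunk."""
    query_words = set(query.lower().split())
    return len(query_words & set(chunk.lower().split())) / max(len(query_words), 1)

def rerank(query: str, chunks: list[str]) -> list[str]:
    return sorted(chunks, key=lambda c: relevance_score(query, c), reverse=True)

candidates = [
    "Pricing plans are described on the website.",
    "Max Chunks Per Query controls how many chunks are returned.",
    "Reranking reorders retrieved chunks by relevance.",
]
print(rerank("how many chunks are returned per query", candidates))
```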

When creating a knowledge base, you can enable the "hybrid search" option. Hybrid search allows you to control the tradeoff between dense (semantic) and lexical (keyword) search.

  • Increasing Alpha emphasizes semantic search

  • Decreasing Alpha emphasizes lexical matching, i.e., finding exact keywords in the documents

You may want to use hybrid search if specific keywords matter in your knowledge base and you want to emphasize returning chunks that contain those keywords when the user references them.

Hybrid search defaults to off.
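In terms of scoring, hybrid search of this kind is commonly computed as a weighted blend of the two scores, with Alpha weighting the semantic side. The snippet below illustrates that common convention; it is not a statement of VectorShift's exact implementation.

```python
# Illustration of the usual alpha-weighted hybrid scoring convention.
# VectorShift's exact formula may differ; this only shows what moving Alpha does.

def hybrid_score(semantic_score: float, keyword_score: float, alpha: float) -> float:
    """alpha = 1.0 -> purely semantic; alpha = 0.0 -> purely keyword-based."""
    return alpha * semantic_score + (1 - alpha) * keyword_score

# A chunk that matches keywords exactly but is semantically weaker:
print(hybrid_score(semantic_score=0.4, keyword_score=0.9, alpha=0.8))  # 0.5 -> semantic dominates
print(hybrid_score(semantic_score=0.4, keyword_score=0.9, alpha=0.2))  # 0.8 -> keywords dominate
```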

Metadata Filtering

Using document metadata for filtering documents can improve the relevance of the returned documents.

Checking the "Enable Filter" box allows you to input an additional filter query.

You can specify a filter query using query syntax similar to MongoDB's. To filter on a specific metadata field, specify the field name and the desired value. For example, if we have a collection of book summaries with metadata stored in a vector database, we can query for all books in the "mystery" genre:

{"genre": "mystery"}

Semantic Search

The Semantic Search node has two input edges, "query" and "documents", and one output edge, "result". The node accepts documents as input, usually loaded through a data loader (e.g., URL loader, YouTube video loader), and stores them in a temporary vector database at run time. Within the gear of the node, you can also control how many chunks are returned, enable metadata filtering, and rerank documents. See below for a common pipeline architecture using Semantic Search.
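A minimal sketch of that run-time flow is below, using a toy in-memory index in place of the temporary vector database. The bag-of-words "embedding" is a simplified stand-in for a real embedding model, and the documents are made-up examples of what data loaders might produce.

```python
# Minimal sketch of the Semantic Search flow: documents loaded at run time are
# indexed in a temporary store and then queried. The "embedding" here is a toy
# bag-of-words vector; a real pipeline uses an embedding model.

from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def similarity(a: Counter, b: Counter) -> int:
    return sum((a & b).values())  # crude word-overlap score

documents = [
    "The URL loader scraped this pricing page at run time.",
    "This transcript came from a YouTube video loader.",
]

# Temporary index built when the pipeline runs; it is discarded afterwards.
index = [(doc, embed(doc)) for doc in documents]

query = "what does the pricing page say"
query_vec = embed(query)
results = sorted(index, key=lambda item: similarity(query_vec, item[1]), reverse=True)
print(results[0][0])  # the pricing page document is the closest match
```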
