Knowledge Base
Nodes to retrieve information related to the query
Knowledge bases allow you to build LLM workflows with your own data. There are two ways to create a knowledge base with VectorShift - either through the Knowledge Base Reader node or through the Semantic Search node.
A Knowledge Base Reader node queries vectors that have already been stored permanently in the database. A Semantic Search node loads documents into a temporary semantic database and allows querying to find the most similar documents.
Use a Knowledge Base Reader node when you want to query data that has already been loaded into a knowledge base.
Use a Semantic Search node when your pipeline loads new data at run time for querying.
Knowledge bases allow you to return the most relevant pieces of information to a model. This is important for a few reasons:
LLMs have limited context window lengths, meaning there is a maximum amount of data they can take in at once. If you are looking to query over large amounts of data (e.g., your Google Drive, OneDrive, Notion pages), you won't be able to do so without creating a knowledge base.
Even if you are able to fit all of the data within the context window, multiple articles have shown that passing in a lot of data can increase hallucinations (e.g., the model not utilizing the most relevant information to answer the question).
From a technical perspective, knowledge bases are referred to as "Vector Databases". See this blog post for a detailed explanation of vector databases.
Knowledge Base Reader
The Knowledge Base Reader has one input edge "query" and one output edge "result". The node links to an existing knowledge base that you have defined (use the pull-down menu within the node to find existing knowledge bases). You can create a new knowledge base by clicking "Create a New Knowledge Base".
How to use the Knowledge Base Reader?
Within knowledge bases, you can store data of various types, e.g., static files, URLs, recursive URLs (to scrape all the subpages of a site), data from integrations (e.g., Google Drive, OneDrive, Notion), transcripts of YouTube videos, and more. For URLs, you will have the option to re-scrape the URLs at selected time intervals. For integrations, we automatically live-sync the data so that your knowledge base is always up to date.
A common structure for using the Knowledge Base Reader is as follows. A question is used to query the Knowledge Base Reader, which returns the most relevant pieces of information from the entire corpus. The relevant pieces of information are then passed to an LLM to use as Context when it answers the question.
Another common structure is having an LLM generate a search query and connecting the output of the LLM to the Query input edge of the Knowledge Base Reader. In certain cases, this can increase the relevance of the data coming back from the knowledge base.
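The first structure above (question, retrieval, then an LLM that receives the results as context) can be sketched in plain Python. This is a minimal illustration and not VectorShift's implementation: the function names `retrieve` and `build_prompt`, the sample corpus, and the word-overlap scoring are all assumptions made so the sketch runs without any external service. A real knowledge base compares embedding vectors instead.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], max_chunks: int = 2) -> list[str]:
    """Hypothetical stand-in for the Knowledge Base Reader.

    Ranks chunks by how many words they share with the question.
    """
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(c)), c) for c in chunks]
    # Keep only chunks sharing at least one word, best matches first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [chunk for _, chunk in ranked[:max_chunks]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into one LLM prompt."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Illustrative corpus; in a pipeline this would be your knowledge base.
corpus = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]
question = "How long do refunds take?"
prompt = build_prompt(question, retrieve(question, corpus, max_chunks=1))
print(prompt)
```

Only the chunk about refunds reaches the prompt, so the LLM reasons over one relevant passage rather than the whole corpus.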
Example System and Prompt when working with Knowledge Base Reader
To search over a knowledge base, you can employ an LLM to address the users' inquiries. Below are an example System and Prompt for this use case.
System
Prompt
Creating a Knowledge Base
When you click "Create a Knowledge Base" on the Knowledge Base Reader node, it brings up a screen where you can give the knowledge base a name and a description.
Under "Advanced Settings", you can also choose the chunk size, chunk overlap, hybrid search (discussed later), and embedding model.
Chunk size: when the data is loaded into the knowledge base, the database will chunk the data, or cut the data into pieces. Here, you control the size of each chunk. The chunk size defaults to 400 tokens, or roughly 1,600 characters (about 4 characters per token). Decreasing the chunk size can sometimes help return more relevant information to an LLM (if a chunk is too large, the LLM can get confused by the large amount of data it has to reason over).
Chunk overlap: the chunk overlap is the number of tokens shared between consecutive chunks. This defaults to 0 tokens. Increase the chunk overlap if you are concerned that chunking is cutting off important data (e.g., if a chunk boundary falls in the middle of a word or sentence).
Embedding Model: this is the model that is used to embed the data into the knowledge base / vector database. This defaults to "text-embedding-3-small", chosen for its balance of performance and speed.
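To make chunk size and chunk overlap concrete, here is a minimal sketch of fixed-size chunking. It is an illustration under stated assumptions, not the chunker VectorShift uses: the function name `chunk_tokens` is hypothetical, and whitespace-separated words stand in for model tokens.

```python
def chunk_tokens(
    tokens: list[str], chunk_size: int = 400, chunk_overlap: int = 0
) -> list[list[str]]:
    """Split tokens into chunks of at most chunk_size tokens.

    Consecutive chunks share chunk_overlap tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be at least 0 and below chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the input
    return chunks


# Words stand in for tokens here (an assumption made for readability).
words = "a b c d e f g h i j".split()
print(chunk_tokens(words, chunk_size=4, chunk_overlap=0))
print(chunk_tokens(words, chunk_size=4, chunk_overlap=1))
```

With an overlap of 1, the last token of each chunk reappears as the first token of the next one, so text that straddles a boundary is present in both chunks.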
Knowledge Base Query Options
You can change various query parameters to improve your search. You can do this by clicking on the gear on the node.
Max Chunks Per Query: This parameter controls how many chunks or documents are returned from the vector database. For example, you may increase the number of chunks if you find that the LLM is not doing an adequate job of answering the user's question based on the returned context. This defaults to two chunks.
Enable Filter: Enables filtering documents retrieved from the database based on document metadata. See how to filter documents.
Rerank Documents: Performs an additional Reranking step to reorder the documents by relevance to the query.
Note: Reranking incurs a latency cost but can improve query results, especially when returning a large number of chunks.
Reranking is off by default.
Retrieval Unit: Chooses whether to return document information or chunks from the knowledge base. Using documents as the retrieval unit can be helpful when you want to help a user find documents that answer their question rather than the specific information.
Do NL Metadata Query: Whether to use an LLM to automatically generate a metadata filter on your knowledge base. For example, this can enable queries like "Find all pdf documents".
Transform Query: Rephrases the query using an LLM to better match your search.
Answer Multiple Questions: Uses an LLM to break down your query into multiple questions. This is helpful because embedding multiple questions as a single query can make it harder to retrieve the relevant information for each one.
Expand Query: Uses an LLM to perform query expansion. It generates multiple restatements of your query and runs vector search independently on each of them. Query expansion can improve your results by retrieving content relevant to synonymous versions of your original query.
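As a rough illustration of how Expand Query and Max Chunks Per Query can interact, the sketch below runs a search once per query restatement, merges the results, and keeps the top-scoring chunks. The source does not say how results from the restatements are combined; keeping the best score per chunk is an assumption, and `expanded_search`, `toy_search`, and the canned scores are hypothetical names and data used only for this example.

```python
from typing import Callable

# A search function maps a query to (chunk_id, similarity score) pairs.
SearchFn = Callable[[str], list[tuple[str, float]]]


def expanded_search(
    queries: list[str], search: SearchFn, max_chunks: int = 2
) -> list[tuple[str, float]]:
    """Search once per restatement and keep each chunk's best score.

    Merging by maximum score is an assumed strategy, not a documented one.
    """
    best: dict[str, float] = {}
    for query in queries:
        for chunk_id, score in search(query):
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:max_chunks]


# Canned results stand in for a vector database (illustrative data only).
canned = {
    "How do I reset my password?": [("doc-login", 0.82), ("doc-billing", 0.31)],
    "Steps to change a forgotten password": [("doc-recovery", 0.77), ("doc-login", 0.90)],
}


def toy_search(query: str) -> list[tuple[str, float]]:
    return canned.get(query, [])


results = expanded_search(list(canned), toy_search, max_chunks=2)
print(results)
```

Here the second restatement surfaces "doc-recovery", which the original query alone would not have returned.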
Metadata Filtering
Using document metadata for filtering documents can improve the relevance of the returned documents.
Checking the "Enable Filter" box allows you to input an additional filter query.
You can specify a filter query using Qdrant filter syntax. To filter on a specific metadata field, specify the field name and the desired value. For example, if we have a collection of book summaries with metadata stored in a vector database, we can query for all books in the "mystery" genre. Note: Documents in the database may have metadata fields automatically extracted.
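For the book example, a Qdrant-style filter that matches a single metadata field might look like the following, written as a Python dictionary. The field name `genre`, the sample books, and the local `matches` helper are assumptions for illustration; the helper evaluates only the `must` and exact `match` subset of the syntax, and the exact form in which the node expects the filter is not confirmed here.

```python
import json

# Qdrant-style filter: every condition listed under "must" has to match.
# The field name "genre" and the value "mystery" are illustrative assumptions.
genre_filter = {
    "must": [
        {"key": "genre", "match": {"value": "mystery"}}
    ]
}


def matches(metadata: dict, flt: dict) -> bool:
    """Locally evaluate the 'must' + exact 'match' subset of the filter."""
    return all(
        metadata.get(cond["key"]) == cond["match"]["value"]
        for cond in flt.get("must", [])
    )


# Sample documents with metadata (illustrative data only).
books = [
    {"title": "The Hound of the Baskervilles", "metadata": {"genre": "mystery"}},
    {"title": "Dune", "metadata": {"genre": "science fiction"}},
]

mystery_titles = [b["title"] for b in books if matches(b["metadata"], genre_filter)]
print(json.dumps(genre_filter))
print(mystery_titles)
```

The printed JSON is the filter itself; the list shows which sample documents would pass it.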
Semantic Search
The Semantic Search node has two input edges, "query" and "documents", and one output edge, "result". The node accepts documents as input, usually loaded through a data loader (e.g., URL loader, YouTube video loader), and stores them in a temporary vector database at run time. Within the gear of the node, you can also control how many chunks are returned, enable metadata filtering, and rerank documents. See below for a common pipeline architecture using Semantic Search.
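The idea of a temporary vector database that exists only for one run can be sketched as a small in-memory index. This is a toy model under stated assumptions, not the node's actual behavior: the class name `TemporaryIndex` is hypothetical, and word-count vectors with cosine similarity stand in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a word-count vector (a real pipeline uses an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TemporaryIndex:
    """In-memory index built from documents at run time and discarded afterwards."""

    def __init__(self, documents: list[str]):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def query(self, query: str, max_chunks: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:max_chunks]]


# Documents as they might arrive from a data loader (illustrative data only).
documents = [
    "The video explains how to train a neural network.",
    "This page lists pricing plans and discounts.",
    "A transcript about baking sourdough bread.",
]
index = TemporaryIndex(documents)
print(index.query("pricing plans", max_chunks=1))
```

Because the index is rebuilt from the "documents" input on every run, nothing persists between runs, which is the main contrast with the Knowledge Base Reader.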