Knowledge Bases
Interact with Knowledge Bases through Python classes.
Knowledge Bases are a type of database/storage system offered by the VectorShift platform that allows you to store various kinds of data, such as text, files, and scraped URLs, into (one or more) vector embeddings that represent the meaning of the shared data. We suggest reading the platform documentation on Knowledge Bases to gain an appropriate context.
The SDK offers an interface atop the API endpoints, through a class, to easily fetch, manipulate, and save Knowledge Bases. Since some methods interface with the VectorShift platform, they require API keys. If the API keys have already been set as environment variables, they need not be supplied to those methods. The output of every method that invokes the API is a dictionary representing the API's JSON response.
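The fallback from an explicit key to an environment variable can be pictured with the small sketch below. It is not the SDK's own code: the helper name resolve_api_key and the variable name VECTORSHIFT_API_KEY are assumptions made for illustration, so check the platform documentation for the name the SDK actually reads.

```python
import os
from typing import Optional

# Assumed environment variable name, used here only for illustration.
API_KEY_ENV_VAR = "VECTORSHIFT_API_KEY"


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicitly supplied key, else fall back to the environment."""
    if api_key:
        return api_key
    env_key = os.environ.get(API_KEY_ENV_VAR)
    if not env_key:
        raise ValueError(
            f"No API key supplied and {API_KEY_ENV_VAR} is not set."
        )
    return env_key
```

An explicit argument wins over the environment in this sketch, which matches the usual expectation that a per-call value overrides a global default.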
Creating Knowledge Bases
The KnowledgeBase class represents a Knowledge Base that either already exists on the platform (if an ID is given) or is new (if no ID is given).
Note: To work with existing Knowledge Base objects, we suggest using the fetch method instead.
Arguments:
name: The name of the Knowledge Base.
description: The description of the Knowledge Base.
chunk_size: The chunk size of the Knowledge Base (the default size, in bytes, of each unit of information uploaded).
chunk_overlap: The chunk overlap of the Knowledge Base (the default number of bytes of overlap between each unit of information uploaded). For instance, if the total data size is 1000 tokens, a size of 500 and overlap of 0 will give 2 documents (tokens 1-500, 501-1000), while an overlap of 250 gives 3 (tokens 1-500, 251-750, 501-1000).
is_hybrid: Whether the Knowledge Base supports hybrid search. Hybrid search allows you to control the tradeoff between dense (semantic) and lexical (keyword) search.
id: (Optional) The ID of the Knowledge Base, which, if given, should correspond with a Knowledge Base you already own on the VectorShift platform. If blank, it represents a new Knowledge Base.
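The interaction between chunk_size and chunk_overlap is easiest to see by computing the chunk boundaries directly. The function below is a minimal sketch of a sliding window with a stride of chunk_size minus chunk_overlap; the name chunk_spans is hypothetical, and the platform's actual splitter may handle boundaries differently.

```python
from typing import List, Tuple


def chunk_spans(total: int, chunk_size: int, chunk_overlap: int) -> List[Tuple[int, int]]:
    """Return 1-indexed, inclusive (start, end) spans covering `total` units."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    stride = chunk_size - chunk_overlap  # how far each new chunk advances
    spans: List[Tuple[int, int]] = []
    start = 1
    while start <= total:
        end = min(start + chunk_size - 1, total)
        spans.append((start, end))
        if end == total:  # the data is fully covered
            break
        start += stride
    return spans


print(chunk_spans(1000, 500, 0))    # two chunks, no shared units
print(chunk_spans(1000, 500, 250))  # three chunks, each sharing 250 units with its neighbor
```

A larger overlap produces more chunks for the same data, which increases storage and embedding cost but reduces the chance that a relevant passage is cut in half at a chunk boundary.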
fetch
A static method that creates a KnowledgeBase object representing an existing Knowledge Base on the VectorShift platform, given an ID or name.
Arguments:
base_id: The ID of the Knowledge Base to fetch.
base_name: The name of the Knowledge Base to fetch. At least one of base_id or base_name should be provided. If both are provided, base_name is used to search for the Knowledge Base.
username: (Optional) The username of the user owning the Knowledge Base.
org_name: (Optional) The organization name of the user owning the Knowledge Base, if applicable.
api_key: The VectorShift API key.
save
A method to save or update a KnowledgeBase object to the VectorShift platform.
Arguments:
update_existing: Whether to replace an existing Knowledge Base instead of saving a new one. If set to True, the KnowledgeBase should have an ID, and the existing Knowledge Base with that ID will be replaced with the object's data. If set to False, the ID, if any, is ignored and a new Knowledge Base object is created with the object's data.
api_key: The VectorShift API key.
update_metadata
A method to update metadata fields for items in a Knowledge Base. The KnowledgeBase object should already have an ID.
Arguments:
list_of_item_ids: The IDs of the items whose metadata is to be updated.
list_of_metadata: The new metadata for all items. It should have the same length as list_of_item_ids. For each i, the i-th element of list_of_metadata will be the new metadata for the item identified by the i-th element of list_of_item_ids.
keep_prev: Whether to keep the existing metadata for each item. If set to True, the new metadata is added to the existing metadata. If set to False, the old metadata is discarded.
api_key: The VectorShift API key.
public_key: The public key to use for authentication, if applicable.
private_key: The private key to use for authentication, if applicable.
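The index-wise pairing of list_of_item_ids with list_of_metadata, and the effect of keep_prev, can be modeled locally as follows. This is a sketch of the semantics described above, not the SDK's implementation: the function name apply_metadata_updates and the in-memory store are hypothetical, and the choice that new values win when a key already exists is an assumption.

```python
from typing import Any, Dict, List

Metadata = Dict[str, Any]


def apply_metadata_updates(
    store: Dict[str, Metadata],
    list_of_item_ids: List[str],
    list_of_metadata: List[Metadata],
    keep_prev: bool,
) -> Dict[str, Metadata]:
    """Return a copy of `store` with the i-th metadata applied to the i-th item."""
    if len(list_of_item_ids) != len(list_of_metadata):
        raise ValueError("list_of_item_ids and list_of_metadata must have the same length")

    # Copy so the caller's store is left untouched.
    updated = {item_id: dict(meta) for item_id, meta in store.items()}
    for item_id, new_meta in zip(list_of_item_ids, list_of_metadata):
        if keep_prev:
            merged = dict(updated.get(item_id, {}))
            merged.update(new_meta)  # assumption: new values override old ones
            updated[item_id] = merged
        else:
            updated[item_id] = dict(new_meta)  # old metadata is discarded
    return updated
```

Validating the two list lengths before doing any work avoids silently dropping trailing entries, which is what a bare zip would do.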
update_selected_files
A method to update files associated with an integration. The KnowledgeBase object should already have an ID.
Arguments:
integration_id: The ID of the specific integration with associated files in the Knowledge Base. Only files associated with this integration will be updated.
keep_prev: Whether or not to keep the previous versions of the files. If set to True, additional files are added separately.
selected_items: Names of the specific files to update.
select_all_items_flag: If this flag is True, all files associated with the integration are updated.
api_key: The VectorShift API key.
sync
A method to sync the Knowledge Base object with the VectorShift platform (such that the object reflects the most up-to-date version of the Knowledge Base from the platform). The KnowledgeBase object should already have an ID.
Arguments:
api_key: The VectorShift API key.
load_documents
A method to add/load a new document into the Knowledge Base, or add files associated with an integration. The KnowledgeBase object should already have an ID.
Arguments:
document: The document to load into the Knowledge Base. It should correspond with the value of document_type; see below.
document_name: The name of the document.
document_type: The type of document provided. It should be one of the following options:
  "File": Loads a file. document should be the path to a file.
  "Integration": Represents an integration. All files from the integration are loaded. document should be JSON data for the integration.
  Otherwise, document is treated as text.
chunk_size: The maximum size of vectors that this document will be split into (if the document must be split into multiple vectors).
chunk_overlap: The striding of vectors that this document will be split into (if the document must be split into multiple vectors).
selected_items: Used when document_type is "Integration". Lists the names of the specific files associated with the integration to update.
select_all_items_flag: Used when document_type is "Integration". If this flag is True, all files associated with the integration are updated.
metadata: General metadata to be added to (each of) the new document(s).
metadata_by_item: Used to add metadata to specific documents. Should be a dictionary of file names to document-specific metadata.
api_key: The VectorShift API key.
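One way to picture how metadata and metadata_by_item combine is the local sketch below. It is illustrative only: the helper name effective_metadata is hypothetical, and the rule that document-specific values override general ones on a key clash is an assumption that the text above does not confirm.

```python
from typing import Any, Dict, List, Optional

Metadata = Dict[str, Any]


def effective_metadata(
    file_names: List[str],
    metadata: Optional[Metadata] = None,
    metadata_by_item: Optional[Dict[str, Metadata]] = None,
) -> Dict[str, Metadata]:
    """Map each file name to its general metadata plus any file-specific metadata."""
    general = metadata or {}
    per_item = metadata_by_item or {}
    result: Dict[str, Metadata] = {}
    for name in file_names:
        combined = dict(general)                  # applies to every document
        combined.update(per_item.get(name, {}))   # assumption: specific values win
        result[name] = combined
    return result
```

For example, passing a general {"source": "drive"} together with {"report.pdf": {"year": 2023}} would tag every file with the source and only report.pdf with the year.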
query
A method to query the Knowledge Base for specific documents. Returns a JSON response with the documents.
Arguments:
query: A string query to the Knowledge Base.
max_docs: The maximum number of documents to be returned from the query.
filter: Additional filters. Only documents whose metadata contains the key-value pairs specified in filter will be returned.
rerank: Whether or not to rerank documents upon retrieval.
api_key: The VectorShift API key.
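The filter and max_docs arguments can be understood through a small local model. The sketch below covers only the metadata-matching and truncation steps; it does not perform any semantic search or reranking, and the document shape (a dictionary with a "metadata" key) and the function names are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

Document = Dict[str, Any]


def matches_filter(doc_metadata: Dict[str, Any], filter_: Dict[str, Any]) -> bool:
    """True if every key-value pair in `filter_` is present in the metadata."""
    return all(
        key in doc_metadata and doc_metadata[key] == value
        for key, value in filter_.items()
    )


def filter_documents(
    documents: List[Document],
    filter_: Optional[Dict[str, Any]] = None,
    max_docs: Optional[int] = None,
) -> List[Document]:
    """Keep documents whose metadata matches, then cap the result at max_docs."""
    selected = [
        doc for doc in documents
        if matches_filter(doc.get("metadata", {}), filter_ or {})
    ]
    return selected if max_docs is None else selected[:max_docs]
```

An empty filter matches every document, so in this model the result is limited only by max_docs.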
list_documents
A method to list all existing documents in the Knowledge Base. Returns a JSON representation of the documents.
Arguments:
max_docs: The maximum number of documents to be returned.
api_key: The VectorShift API key.
delete_documents
A method to delete a document by ID from the Knowledge Base. We are currently in the process of building out this functionality.
Arguments:
document_id: The ID of the document to delete. For now, this should be a singleton list containing the ID.
filter: Forthcoming filter for documents to delete. Currently has no usage.
api_key: The VectorShift API key.
share
A method to share a Knowledge Base object with one or more emails.
Arguments:
shared_users: A list of emails to share the Knowledge Base with.
api_key: The VectorShift API key.
fetch_shared
A method that returns a list of all emails with which the Knowledge Base is shared.
Arguments:
api_key: The VectorShift API key.
remove_share
A method to remove sharing from one or more emails.
Arguments:
shared_users: A list of emails from which to remove sharing.
api_key: The VectorShift API key.