Vector Stores

Interact with Vector Stores through Python classes.

Vector Stores are a storage system offered by the VectorShift platform. They let you store various kinds of data, such as text, files, and scraped URLs, as one or more vector embeddings that represent the meaning of the stored data. We suggest reading the platform documentation on Vector Stores first for context.

The SDK provides a class that wraps the API endpoints so you can fetch, manipulate, and save Vector Stores. Methods that call the VectorShift platform require API keys; if the keys have already been set as environment variables, you can omit them from those calls. Every method that invokes the API returns a dictionary representing the API's JSON response.

vectorshift.vectorstore.VectorStore(
    name: str,
    description: str = '',
    chunk_size: int = 400,
    chunk_overlap: int = 0,
    id: str = None,
)

Represents a Vector Store object that may be existing (if an ID is given) or new (if no ID is given). Note: To work with existing Vector Store objects, we suggest using the fetch method instead.

Arguments:

  • name: The name of the Vector Store.

  • description: A brief description of the Vector Store.

  • chunk_size: The default maximum size of vectors stored in the Vector Store.

  • chunk_overlap: The default striding of documents when stored in the Vector Store. If an object cannot be stored as a single vector in the Vector Store, it will be broken up into several vectors. The overlap determines how the object is broken up. For instance, if the total data size is 1000 tokens, a chunk size of 500 with an overlap of 0 gives 2 documents (tokens 1-500, 501-1000), while an overlap of 250 gives 3 (tokens 1-500, 251-750, 501-1000).

  • id: The ID of the Vector Store, which, if given, should correspond to a Vector Store you already own on the VectorShift platform. If left blank, the object represents a new Vector Store.
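
The chunking arithmetic described for chunk_size and chunk_overlap can be sketched in plain Python. This is an illustrative helper, not part of the SDK: chunk_spans is a hypothetical name, and it assumes the distance between chunk starts is chunk_size minus chunk_overlap, which matches the chunk counts in the example above. The platform's actual splitter may differ in its details.

```python
def chunk_spans(total_tokens, chunk_size, chunk_overlap):
    """Return 1-indexed, inclusive (start, end) token spans for each chunk.

    Hypothetical helper for illustration only; not a VectorShift API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    stride = chunk_size - chunk_overlap  # assumed distance between chunk starts
    spans = []
    start = 1
    while start <= total_tokens:
        end = min(start + chunk_size - 1, total_tokens)
        spans.append((start, end))
        if end == total_tokens:  # the data is fully covered
            break
        start += stride
    return spans


no_overlap = chunk_spans(1000, 500, 0)      # two chunks
half_overlap = chunk_spans(1000, 500, 250)  # three chunks
```

A larger overlap yields more chunks for the same data, so it trades storage for more context shared between neighbouring chunks.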

vectorshift.vectorstore.VectorStore.fetch(
    vectorstore_name: str,
    vectorstore_id: str,
    username: str,
    org_name: str,
    api_key: str = None,
)

A static method that creates a VectorStore object representing an existing Vector Store on the VectorShift platform, given an ID or name.

Arguments:

  • vectorstore_name: The name of the Vector Store being represented.

  • vectorstore_id: The ID of the Vector Store being represented. At least one of vectorstore_id and vectorstore_name should be provided. If both are provided, vectorstore_id is used to search for the Vector Store.

  • username: The username of the user owning the Vector Store.

  • org_name: The organization name of the user owning the Vector Store, if applicable.

  • api_key: The VectorShift API key.
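
Because fetch needs at least one of vectorstore_id and vectorstore_name, a thin wrapper can check this before any request is made. The sketch below is an assumption-laden illustration: fetch_store is a hypothetical name, the class is passed in as store_cls so the snippet does not depend on the SDK being installed, and passing None for arguments you do not use is assumed to be acceptable (the signature above lists no defaults for them).

```python
def fetch_store(store_cls, vectorstore_id=None, vectorstore_name=None,
                username=None, org_name=None, api_key=None):
    """Call store_cls.fetch after checking that an ID or a name was given.

    Hypothetical wrapper; only the keyword names come from the documented
    fetch signature.
    """
    if not vectorstore_id and not vectorstore_name:
        raise ValueError("provide vectorstore_id or vectorstore_name")
    return store_cls.fetch(
        vectorstore_name=vectorstore_name,
        vectorstore_id=vectorstore_id,  # used for the lookup if both are given
        username=username,
        org_name=org_name,
        api_key=api_key,
    )


# Intended usage with the SDK installed ("my-store" is a placeholder name):
#   from vectorshift.vectorstore import VectorStore
#   store = fetch_store(VectorStore, vectorstore_name="my-store")
```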

save(update_existing: bool = False, api_key: str = None) 

A method to save or update a VectorStore object to the VectorShift platform.

Arguments:

  • update_existing: Whether to replace an existing Vector Store (True) or save the object as a new one (False). If set to True, the VectorStore should have an ID, and the existing Vector Store with that ID will be replaced with the object's data. If set to False, the ID, if any, is ignored and a new Vector Store object is created with the object's data.

  • api_key: The VectorShift API key.
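
The interaction between update_existing and the object's ID can be summarized as a small decision function. This is a sketch, not SDK code: save_outcome is a hypothetical name, and raising an error when update_existing is True but no ID is present is an assumption (the text above only says the object should have an ID).

```python
def save_outcome(store_id, update_existing):
    """Describe what save() is documented to do for a given ID and flag.

    Hypothetical helper for illustration; it performs no API calls.
    Returns ("replace", store_id) or ("create", None).
    """
    if update_existing:
        if not store_id:
            # Assumed behavior: the docs only say the object "should have an ID".
            raise ValueError("update_existing=True requires a VectorStore with an ID")
        return ("replace", store_id)
    # With update_existing=False, any ID is ignored and a new store is created.
    return ("create", None)
```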

update_metadata(
    list_of_item_ids: list[str],
    list_of_metadata: list[str],
    keep_prev: bool,
    api_key: str = None,
) 

A method to update metadata fields for items in a Vector Store. The VectorStore object should already have an ID.

Arguments:

  • list_of_item_ids: The IDs of the items whose metadata is to be updated.

  • list_of_metadata: The new metadata for all items. Should have the same length as list_of_item_ids. For each i, the ith element of list_of_metadata will be the new metadata for the item identified by the ith element of list_of_item_ids.

  • keep_prev: Whether to keep the existing metadata for each item. If set to True, the new metadata is added to the existing metadata. If set to False, the old metadata is discarded and replaced by the new metadata.

  • api_key: The VectorShift API key.
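
Since list_of_item_ids and list_of_metadata are matched by position, a length mismatch silently pairs the wrong items. A minimal sketch of a guard follows; update_metadata_checked is a hypothetical wrapper, and store is assumed to be a VectorStore object that already has an ID. The metadata elements are passed through unchanged.

```python
def update_metadata_checked(store, item_ids, metadata, keep_prev, api_key=None):
    """Validate the positional pairing, then call store.update_metadata.

    Hypothetical wrapper; only the keyword names come from the documented
    update_metadata signature.
    """
    item_ids = list(item_ids)
    metadata = list(metadata)
    if len(item_ids) != len(metadata):
        raise ValueError(
            f"got {len(item_ids)} item IDs but {len(metadata)} metadata entries"
        )
    return store.update_metadata(
        list_of_item_ids=item_ids,   # item_ids[i] receives metadata[i]
        list_of_metadata=metadata,
        keep_prev=keep_prev,
        api_key=api_key,
    )
```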

update_selected_files(
    integration_id: str,
    keep_prev: bool,
    selected_items: list[str] = None,
    select_all_items_flag: bool = True,
    api_key: str = None,
)

A method to update files associated with an integration. The VectorStore object should already have an ID.

Arguments:

  • integration_id: The ID of the specific integration with associated files in the Vector Store. Only files associated with this integration will be updated.

  • keep_prev: Whether or not to keep the previous versions of the files. If set to True, additional files are added separately.

  • selected_items: Names of the specific files to update.

  • select_all_items_flag: If this flag is True, all files associated with the integration are updated.

  • api_key: The VectorShift API key.

sync(
    api_key: str = None,
)

A method to sync the Vector Store object with the VectorShift platform, so that the object reflects the most up-to-date version of the Vector Store on the platform. The VectorStore object should already have an ID.

Arguments:

  • api_key: The VectorShift API key.

load_documents(
    document,
    document_name: str = None,
    document_type: str = 'File',
    chunk_size: int = None,
    chunk_overlap: int = None,
    selected_items: list = None,
    select_all_items_flags: list = None,
    metadata: dict = None,
    metadata_by_item: dict = None,
    api_key: str = None,
)

A method to load a new document into the Vector Store, or add files associated with an integration. The VectorStore object should already have an ID.

Arguments:

  • document: The document to load into the Vector Store. It should correspond with the value of document_type; see below.

  • document_name: The name of the document.

  • document_type: The type of document provided. It should be one of the following options:

    • File: Loads a file. document should be the path to a file.

    • Integration: Represents an integration. All files from the integration are loaded. document should be JSON data for the integration.

    Otherwise, document is treated as text.

  • chunk_size: The maximum size of vectors that this document will be split into (if the document must be split into multiple vectors).

  • chunk_overlap: The striding of vectors that this document will be split into (if the document must be split into multiple vectors).

  • selected_items: Used when document_type is Integration. Lists the names of the specific files associated with the integration to update.

  • select_all_items_flags: Used when document_type is Integration. If this flag is True, all files associated with the integration are updated.

  • metadata: General metadata to be added to (each of) the new document(s).

  • metadata_by_item: Used to add metadata to specific documents. Should be a dictionary of file names to document-specific metadata.

  • api_key: The VectorShift API key.
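
How document is interpreted depends on document_type, as described above. The rule can be sketched as a small function; document_kind is a hypothetical name used only for illustration, and exact, case-sensitive matching of the type string is an assumption.

```python
def document_kind(document_type="File"):
    """Return how load_documents is documented to interpret `document`.

    Hypothetical helper; it mirrors the documented rule and makes no API calls.
    """
    if document_type == "File":
        return "file path"         # document is the path to a file
    if document_type == "Integration":
        return "integration JSON"  # document is JSON data for the integration
    return "text"                  # any other value: document is treated as text
```

Note that the default document_type is 'File', so loading raw text requires passing some other value explicitly.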

query(
    query: str, 
    max_docs: int = 5, 
    filter: dict = None, 
    rerank: bool = False, 
    api_key: str = None, 
)

A method to query the Vector Store for specific documents. Returns a JSON response with the documents.

Arguments:

  • query: A string query to the Vector Store.

  • max_docs: The maximum number of documents to be returned from the query.

  • filter: Additional filters. Only documents whose metadata contains the key-value pairs specified in filter will be returned.

  • rerank: Whether or not to rerank documents upon retrieval.

  • api_key: The VectorShift API key.
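
The filter rule (a document is returned only if its metadata contains every key-value pair in filter) can be illustrated locally. The sketch below is not how the platform implements filtering: matches_filter is a hypothetical name, and exact equality on values is an assumption.

```python
def matches_filter(metadata, filter_spec):
    """Return True if metadata contains every key-value pair in filter_spec.

    Hypothetical local illustration of the documented filter rule.
    """
    if not filter_spec:
        return True  # no filter: every document qualifies
    return all(
        key in metadata and metadata[key] == value
        for key, value in filter_spec.items()
    )


# Made-up metadata for three documents.
docs = [
    {"source": "handbook", "year": 2023},
    {"source": "handbook", "year": 2024},
    {"source": "blog", "year": 2024},
]
selected = [d for d in docs if matches_filter(d, {"source": "handbook", "year": 2024})]
```

Here only the second document is selected, because it is the only one matching both pairs.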

list_documents(
    max_documents: int = 5,
    api_key: str = None, 
)

A method to list all existing documents in the Vector Store. Returns a JSON representation of the documents.

Arguments:

  • max_documents: The maximum number of documents to be returned.

  • api_key: The VectorShift API key.

delete_documents(
    document_id: list[str],
    filter: dict = None,
    api_key: str = None, 
)

A method to delete a document by ID from the Vector Store. We are currently in the process of building out this functionality.

Arguments:

  • document_id: The ID of the document to delete. For now, this should be a singleton list containing the ID.

  • filter: Forthcoming filter for documents to delete. Currently has no usage.

  • api_key: The VectorShift API key.
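
Because document_id currently has to be a singleton list, a small wrapper can hide that detail from callers. This is a sketch under stated assumptions: delete_document is a hypothetical name, and store is assumed to be a VectorStore object that already has an ID.

```python
def delete_document(store, document_id, api_key=None):
    """Delete one document by wrapping its ID in a singleton list.

    Hypothetical wrapper; only the keyword names come from the documented
    delete_documents signature. The filter argument is left unset because
    it currently has no usage.
    """
    if not isinstance(document_id, str) or not document_id:
        raise ValueError("document_id must be a non-empty string")
    return store.delete_documents(document_id=[document_id], api_key=api_key)
```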

share(
    shared_users: list[str], 
    api_key: str = None, 
)

A method to share a Vector Store object with one or more emails.

Arguments:

  • shared_users: A list of emails to share the Vector Store with.

  • api_key: The VectorShift API key.

fetch_shared(
    api_key: str = None, 
)

A method that returns a list of all emails with which the Vector Store is shared.

Arguments:

  • api_key: The VectorShift API key.

remove_share(
    users_to_remove: list[str],
    api_key: str = None, 
)

A method to remove sharing from one or more emails.

Arguments:

  • users_to_remove: A list of emails from which to remove sharing.

  • api_key: The VectorShift API key.
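
The three sharing methods can be combined to bring a Vector Store's sharing in line with a desired list of emails. The sketch below only computes the two lists to pass on; share_changes is a hypothetical helper, and extracting the current emails from the fetch_shared response is left to the caller because the response shape is not specified here.

```python
def share_changes(current_emails, desired_emails):
    """Compute which emails to add and which to remove.

    Hypothetical helper; returns (to_share, to_remove) as sorted lists.
    """
    current = set(current_emails)
    desired = set(desired_emails)
    to_share = sorted(desired - current)   # candidates for share(shared_users=...)
    to_remove = sorted(current - desired)  # candidates for remove_share(users_to_remove=...)
    return to_share, to_remove


to_share, to_remove = share_changes(
    ["ana@example.com", "bo@example.com"],
    ["bo@example.com", "cy@example.com"],
)
```

Emails present in both lists are left untouched, so no call is needed for them.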
