The KnowledgeBase class is the SDK surface for VectorShift’s managed retrieval store. Ingest files, URLs, folders, tables, or third-party integrations; query them with vector + keyword search, hybrid fusion, rerank, and optional QA, directly from Python, via an Agent tool, or via a Pipeline node.
Prerequisites: Installed SDK · API key set · Python 3.10+.
Mental model
- A KB is a named collection of items (files, URLs, table rows, integration records). Each item is chunked, embedded, and indexed once on ingest.
- Ingestion is task-based: every add_files / add_urls / add_folder / add_tables call returns an IngestionTask you can poll, or use the _and_wait variant, which blocks until COMPLETED.
- Querying is a single surface: kb.query("text", top_k=…, filters=…, hybrid=…, rerank=…, qa=…). Pass kwargs or a single QueryConfig, never both. Returns a typed QueryResult.
- Every method has an async variant (anew, aadd_files, aquery, ascroll, …).
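The "kwargs or a single QueryConfig, never both" rule can be illustrated with a small stand-in. This sketch does not import the VectorShift SDK: the QueryConfig dataclass below, its field types and defaults, and the resolve_query_config helper are all hypothetical and written only for illustration. Only the option names (top_k, filters, hybrid, rerank, qa) come from the text above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class QueryConfig:
    """Hypothetical stand-in; field types and defaults are assumptions."""
    top_k: int = 5
    filters: Optional[dict] = None
    hybrid: bool = False
    rerank: bool = False
    qa: bool = False


def resolve_query_config(config: Optional[QueryConfig] = None,
                         **kwargs: Any) -> QueryConfig:
    """Build one effective config from a QueryConfig or from kwargs, never both."""
    if config is not None and kwargs:
        raise ValueError("pass either a QueryConfig or keyword options, not both")
    if config is not None:
        return config
    # An unknown option name raises TypeError from the dataclass constructor.
    return QueryConfig(**kwargs)
```

Rejecting the mixed form up front avoids having to define which source wins when the same option appears in both places. How the real kb.query reports the conflict is not stated here, so the ValueError is an assumption of this sketch.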
Quick start
How to use a Knowledge Base
| | Direct query | Via an Agent | Via a Pipeline |
|---|---|---|---|
| Surface | kb.query("text", …) | AgentTools.knowledge_base(id=kb.id, …) on Agent.new(tools=[…]) | KnowledgeBaseNode(knowledge_base=kb, …) inside Pipeline.new(...) |
| Output | Typed QueryResult — .chunks, .answer, .citations | Streamed MESSAGE_DELTA events with <vs-cite> tags inline | A pipeline node output (.formatted_text, etc.) you wire into downstream nodes |
| Use when | Building your own retrieval logic; offline scoring; smoke-testing the KB | Conversational RAG — the model decides when to retrieve, emits citations, supports multi-turn memory | Deterministic graphs where retrieval is one fixed step (RAG-pipeline, chatbots, batch jobs) |
| Guide | This page’s Quick start | RAG end-to-end | rag-pipeline example |
Ingestion sources
| Method | What it ingests | Wait helper |
|---|---|---|
| add_files | Local files (Path or bytes-like objects) | add_files_and_wait |
| add_urls | URLs, with optional recursive crawl and rescrape schedule via UrlConfig | add_urls_and_wait |
| add_folder | An entire directory tree | add_folder_and_wait |
| add_tables | Structured table rows | add_tables_and_wait |
| Integrations (Slack, Google Drive, …) | Configured in the platform, then resync_integration refreshes them | — |
Each add_* call returns an IngestionTask with .task_id, .status, .item_ids, and (on failure) .error / .failed_uploads. The _and_wait variants poll until the task reaches a terminal status; the bare ones return immediately and let you poll via ingestion_status(task_id).
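The difference between the bare calls and the _and_wait variants comes down to a polling loop. The following is a minimal sketch of such a loop, not the SDK's implementation: the IngestionTask stand-in, the wait_for_task and make_fake_status_source helpers, the interval and timeout parameters, and every status name other than COMPLETED are assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional

# Only COMPLETED appears in the text; the other status names are assumed.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


@dataclass
class IngestionTask:
    """Stand-in carrying the fields named in the text; types are assumed."""
    task_id: str
    status: str
    item_ids: List[str] = field(default_factory=list)
    error: Optional[str] = None


def wait_for_task(get_status: Callable[[str], IngestionTask],
                  task_id: str,
                  interval: float = 1.0,
                  timeout: float = 300.0,
                  sleep: Callable[[float], None] = time.sleep) -> IngestionTask:
    """Poll get_status(task_id) until a terminal status or until timeout."""
    deadline = time.monotonic() + timeout
    while True:
        task = get_status(task_id)
        if task.status in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {task.status} after {timeout}s")
        sleep(interval)


def make_fake_status_source(statuses: Iterable[str]) -> Callable[[str], IngestionTask]:
    """Fake status endpoint that replays a fixed sequence of statuses."""
    remaining = iter(statuses)

    def get_status(task_id: str) -> IngestionTask:
        return IngestionTask(task_id=task_id, status=next(remaining))

    return get_status


if __name__ == "__main__":
    source = make_fake_status_source(["PENDING", "PROCESSING", "COMPLETED"])
    done = wait_for_task(source, "task-123", interval=0.0)
    print(done.task_id, done.status)
```

Against a real knowledge base, get_status would presumably be the ingestion_status method mentioned above. Check .status and .error on the returned task before relying on .item_ids, since a terminal status is not necessarily a successful one.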
Recent additions
The KB surface was overhauled: ingestion is now task-based (add_files / add_urls / add_folder / add_tables + _and_wait variants), kb.query(...) returns a typed QueryResult with .chunks / .citations / .answer, and querying takes either kwargs (top_k, filters, hybrid, rerank, qa) or a single QueryConfig. Items can be enumerated and filtered via list_items / scroll and re-organised via create_folder / move_items / update_item_metadata.
What’s next
- Reference: every public method, grouped by topic.
- RAG end-to-end guide: wrap a KB as a tool on a conversational Agent.
- RAG pipeline example: compose a KB reader into a Pipeline.
