pipeline.add(name="...").<node>(...). Each entry lists the node’s configuration parameters. See the Pipeline reference for add, run, and lifecycle methods.
chunking
Split text into chunks. Supports different chunking strategies like markdown-aware, sentence-based, or dynamic sizing.
Strategy for grouping segmented text into final chunks. ‘sentence’: groups sentences; ‘markdown’: respects Markdown structure (headers, code); ‘dynamic’: optimizes breaks for size using chosen segmentation method.
One of:
dynamic, markdown, sentence
The text to chunk
The overlap of each chunk of text.
The size of each chunk of text.
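As a rough illustration of how the size and overlap settings interact, the sketch below splits text with a sliding window. It is not the node's implementation: the function name `chunk_words` is hypothetical, and measuring size and overlap in words is an assumption made for the example (the node may count characters or tokens instead).

```python
def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into chunks of `size` words; consecutive chunks share `overlap` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window already reached the end of the text
    return chunks


# With size=4 and overlap=2, each chunk repeats the last two words of the previous one.
example = chunk_words("a b c d e f g", size=4, overlap=2)
# example == ["a b c d", "c d e f", "e f g"]
```

A larger overlap preserves more context across chunk boundaries at the cost of storing more duplicated text.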
The method to break text into units before chunking. ‘words’: splits by word; ‘sentences’: splits by sentence boundary; ‘paragraphs’: splits by blank line/paragraph.
One of:
paragraphs, sentences, words
create_workspace
Create a new workspace in a portal, upload files to its knowledge base, and share it with users
knowledge_base
Semantically query a knowledge base that can contain files, scraped URLs, and data from synced integrations (e.g., Google Drive).
Use additional LLM calls to analyze each document to improve answer correctness
Filter the content returned from the knowledge base. Agents should provide structured metadata filters directly in the filter input when useful.
Enable context
Format the context for the LLM
Enable the document DB filter
Generate an LLM response from the retrieved context
Whether to stream the LLM response
The query used to search documents semantically for relevant content. It must not be empty and should include only information relevant for retrieval or metadata filter generation. Generally, expand any specific acronyms or abbreviations, but include the original acronym or abbreviation as well
Additional context to pass to the query analysis and qa steps
Filter the documents returned from the knowledge base
Structured metadata filter JSON for the knowledge base query. Use a top-level boolean clause such as {"type":"condition","field":"title","operator":"match","value":"Q4 report"}; leave empty when no hard metadata constraint is needed.
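A minimal sketch of producing such a filter string in Python is shown here. The keys (`type`, `field`, `operator`, `value`) and the `match` operator come from the example above; the helper name `condition` is hypothetical, and no other operators or clause types are assumed.

```python
import json


def condition(field: str, operator: str, value: str) -> dict:
    """Build a single condition clause in the shape shown in the parameter description."""
    return {"type": "condition", "field": field, "operator": operator, "value": value}


# Serialize the clause so it can be supplied as the filter input.
filter_json = json.dumps(condition("title", "match", "Q4 report"))
```

Building the clause as a dictionary and serializing it with `json.dumps` avoids hand-written quoting mistakes, such as typographic quotes, which are not valid JSON.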
Use an LLM to generate metadata filters to refine your query. Agents should usually leave this false and provide filters directly in the filter input.
Select an existing knowledge base. Use $object.knowledge_base.? syntax
The system prompt to use for the LLM
The alpha value for the retrieval. 1.0 is pure vector search and 0.0 is pure lexical search
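One common way to read an alpha of this kind is as a linear blend of a vector score and a lexical score. The sketch below assumes that interpretation; the description above only fixes the two endpoints, so the behavior between 0.0 and 1.0 and the function name `hybrid_score` are assumptions, not the documented implementation.

```python
def hybrid_score(vector_score: float, lexical_score: float, alpha: float) -> float:
    """Blend two relevance scores: alpha=1.0 keeps only the vector score, alpha=0.0 only the lexical one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * vector_score + (1.0 - alpha) * lexical_score


pure_vector = hybrid_score(0.8, 0.2, alpha=1.0)   # 0.8
pure_lexical = hybrid_score(0.8, 0.2, alpha=0.0)  # 0.2
```

Under this reading, values near 1.0 favor semantic similarity, while values near 0.0 favor exact keyword matches.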
Extract separate questions from the query and retrieve content separately for each question to improve search performance
Do a natural language metadata query
Expand query to improve semantic search
Expand query terms to improve semantic search
The number of chunks to rerank
Rerank the documents returned from the knowledge base
Refine the initial ranking of returned chunks based on relevancy
The unit of retrieval. Chunks will return the most relevant chunks from the knowledge base as well as their text content. Documents will return the document metadata as well as most relevant snippets from the document. Pages will return complete pages with all chunks from pages containing relevant content
One of:
chunks, documents, pages
The score cutoff
The number of relevant chunks to be returned
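To show how a score cutoff and a chunk limit can combine, the sketch below drops low-scoring chunks first and then keeps the highest-ranked remainder. The order of the two steps, the inclusive threshold, and the name `select_chunks` are assumptions for illustration only.

```python
def select_chunks(scored: list[tuple[str, float]], score_cutoff: float, top_k: int) -> list[tuple[str, float]]:
    """Keep chunks scoring at least `score_cutoff`, then return the `top_k` best by score."""
    kept = [(chunk, score) for chunk, score in scored if score >= score_cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return kept[:top_k]


candidates = [("intro", 0.41), ("pricing", 0.92), ("faq", 0.77), ("footer", 0.10)]
selected = select_chunks(candidates, score_cutoff=0.4, top_k=2)
# selected == [("pricing", 0.92), ("faq", 0.77)]
```

Note that the cutoff can leave fewer chunks than the requested number when few candidates score well.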
Transform the query for better semantic search
The mode to use for the advanced search
One of:
accurate, fast
The model to use for the QA
knowledge_base_actions
Create, load, and sync Knowledge Bases
knowledge_base_agent
Query a knowledge base using an agentic approach with tools.
Select the LLM provider to be used by the agent
One of:
google
Controls the query effort: ‘fast’ for quick answers, ‘focused’ for balanced depth, ‘deep’ for thorough analysis
One of:
deep, fast, focused
If enabled, shows an additional context input to provide context to the agent
If enabled, returns the relevant context/chunks used to generate the answer
Select the LLM model to be used by the agent
One of:
gemini-3-flash-preview
The natural language query. The agent will use this to determine the best way to query the knowledge base. Include the key criteria needed to answer the query.
Optional additional context to help the agent understand the query better (e.g., conversation history, user preferences)
Select an existing knowledge base. You must provide the id in $.object.knowledge_base.id format
If enabled, generates a synthesized answer from the knowledge base
knowledge_base_create
Dynamically create a Knowledge Base with configured options
Strategy for grouping segmented text into final chunks. ‘sentence’: groups sentences; ‘markdown’: respects Markdown structure (headers, code); ‘dynamic’: optimizes breaks for size using chosen segmentation method.
One of:
advanced
To analyze document contents and enrich them when parsing
Apify API Key for scraping URLs (optional)
The overlap of the chunks to store in the knowledge base
The size of the chunks to store in the knowledge base
The name of the collection to store the knowledge base in
The embedding model to use for the knowledge base. Format: provider/model
The embedding provider to use
The file processing implementation to use for parsing documents
One of:
contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract
Whether to create a hybrid knowledge base
The name of the knowledge base to create
The precision to use for the knowledge base
The method to break text into units before chunking. ‘words’: splits by word; ‘sentences’: splits by sentence boundary; ‘paragraphs’: splits by blank line/paragraph.
One of:
paragraphs, sentences, words
Whether to shard the knowledge base
The vector database provider to use
knowledge_base_fetch_document_content
Fetch the full content of a specific document from a knowledge base by scrolling through all its chunks
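The idea of scrolling through chunks to rebuild a document can be sketched as a simple pagination loop. Everything here is hypothetical: `fake_fetch_page` stands in for whatever call returns a page of chunks, and the offset/limit style of paging is an assumption, not the node's documented interface.

```python
from typing import Callable

# In-memory stand-in for a document's stored chunks (illustrative data only).
CHUNKS = ["Intro.", "Body part 1.", "Body part 2.", "Conclusion."]


def fake_fetch_page(offset: int, limit: int) -> list[str]:
    """Hypothetical page fetcher: returns up to `limit` chunks starting at `offset`."""
    return CHUNKS[offset:offset + limit]


def collect_content(fetch_page: Callable[[int, int], list[str]], page_size: int = 2) -> str:
    """Request pages until an empty page is returned, then join all chunks in order."""
    parts: list[str] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break  # no more chunks to scroll through
        parts.extend(page)
        offset += len(page)
    return "\n".join(parts)


full_text = collect_content(fake_fetch_page, page_size=3)
```

Advancing the offset by the number of chunks actually received keeps the loop correct even when the final page is shorter than the page size.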
knowledge_base_fetch_items
Advanced knowledge base item fetching with traversal, filtering, and output shaping capabilities
One of:
ALL, DOCUMENTS, FOLDERS
One of:
long, metadata, short
knowledge_base_get_item_bboxes
Fetch OCR bounding boxes for specific pages of a PDF document in a knowledge base
knowledge_base_list_items
List items (documents and folders) from a knowledge base with pagination support
knowledge_base_loader
Load data into an existing knowledge base.
Select the type of data to load
One of:
File, URL
Scrape sub-pages of the provided link
The knowledge base to load data into
The frequency to rescrape the URL
One of:
Daily, Monthly, Never, Weekly
The raw URL link (e.g., https://vectorshift.ai/)
Use a proxy to crawl the website
Load URLs to crawl from a sitemap. If the URL is a sitemap, it will be used directly. If the URL is not a sitemap, the sitemap will be fetched automatically.
The maximum depth of the URL to crawl
The maximum number of recursive URLs to scrape
Whether to only crawl links from the same domain
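The three crawl limits above (maximum depth, maximum number of URLs, and the same-domain restriction) can be pictured with a breadth-first traversal over a link graph. This is a sketch under stated assumptions, not the loader's crawler: `plan_crawl` and the in-memory `links` mapping are hypothetical, the start page is assumed to sit at depth 0, and it is assumed to count toward the URL limit.

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(start_url: str, links: dict[str, list[str]], max_depth: int,
               max_urls: int, same_domain_only: bool = True) -> list[str]:
    """Return the URLs a breadth-first crawl would visit under the given limits."""
    start_host = urlparse(start_url).netloc
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(visited) < max_urls:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            if same_domain_only and urlparse(nxt).netloc != start_host:
                continue  # skip links that leave the starting domain
            seen.add(nxt)
            visited.append(nxt)
            queue.append((nxt, depth + 1))
            if len(visited) >= max_urls:
                break
    return visited


# Illustrative link graph; example.com and other.org are placeholder hosts.
links = {
    "https://example.com/": ["https://example.com/a", "https://other.org/x"],
    "https://example.com/a": ["https://example.com/b"],
    "https://example.com/b": ["https://example.com/c"],
}
pages = plan_crawl("https://example.com/", links, max_depth=2, max_urls=10)
# pages == ["https://example.com/", "https://example.com/a", "https://example.com/b"]
```

In this example the depth limit stops the crawl before `/c`, and the same-domain restriction excludes the `other.org` link.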
The file to be added to the selected knowledge base. Note: to convert text to file, use the Text to File node
knowledge_base_sync
Automatically trigger a sync to the integrations in the selected knowledge base
semantic_search
Generate a temporary vector database at run-time and retrieve the most relevant pieces from the documents based on the query.
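The general pattern of building a throwaway index at run time and ranking documents against a query can be sketched in a few lines. This toy version uses word-count vectors and cosine similarity instead of a learned embedding model, so it only illustrates the flow; the names `embed`, `cosine`, and `search` are hypothetical and unrelated to the node's internals.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Build a temporary in-memory index for this call and return the top_k most similar documents."""
    index = [(doc, embed(doc)) for doc in documents]  # discarded when the call returns
    query_vec = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


docs = ["the cat sat on the mat", "stock prices rose sharply", "a cat chased a mouse"]
best = search("cat mat", docs, top_k=1)
# best == ["the cat sat on the mat"]
```

Because the index is rebuilt on every call, this approach suits small, ad hoc document sets rather than a persistent knowledge base.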
Use additional LLM calls to analyze each document to improve answer correctness
Filter the content returned from the knowledge base
Additional context passed to advanced search and query analysis
Format the context for the LLM
Filter the documents returned from the knowledge base
Strategy for grouping segmented text into final chunks. ‘sentence’: groups sentences; ‘markdown’: respects Markdown structure (headers, code); ‘dynamic’: optimizes breaks for size using chosen segmentation method.
One of:
dynamic, markdown, sentence
The model to use for the embedding
The query will be used to search documents for relevant pieces semantically.
To analyze document contents and enrich them when parsing
Additional context to pass to the query analysis and qa steps
Filter the documents returned from the knowledge base
The text for semantic search. Note: you may add multiple upstream nodes to this field.
Filter the content returned from the knowledge base
Whether to create a hybrid knowledge base
The method to break text into units before chunking. ‘words’: splits by word; ‘sentences’: splits by sentence boundary; ‘paragraphs’: splits by blank line/paragraph.
Show intermediate steps
The alpha value for the retrieval
Extract separate questions from the query and retrieve content separately for each question to improve search performance
Do a natural language metadata query
Expand query to improve semantic search
Expand query terms to improve semantic search
The maximum number of relevant chunks to be returned
Refine the initial ranking of returned chunks based on relevancy
Refine the initial ranking of returned chunks based on relevancy
The unit of retrieval. Chunks will return the most relevant chunks, Documents will return document metadata with snippets, and Pages will return complete pages with all chunks from pages containing relevant content
One of:
chunks, documents, pages
The score cutoff
Transform the query for better semantic search
The mode to use for the advanced search
One of:
accurate, fast
The model to use for the QA
