Node Classes
The SDK maintains a close correspondence between no-code pipeline nodes and Python classes.
This page documents all node classes available in the SDK, which correspond closely to the nodes available in the no-code pipeline builder. The organization of sections below loosely follows how nodes are grouped into tabs in the no-code editor.
All nodes are classes initialized with parameters depending on the node. We document constructor arguments and outputs. Arguments are listed under input and parameter sections. Inputs denote arguments that are NodeOutputs from earlier nodes passed in, while parameters denote arguments that modify the other properties of the node. Outputs are listed by output name. For instance, if a node n has an output called output_name, then the documentation provides details for output_name, and the way to access this output in Python would be via n.outputs()["output_name"].
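The access pattern can be sketched with a stand-in class. StandInNode below is hypothetical and is not part of the SDK; it only mimics the outputs() dictionary described above and the output() shortcut that single-output nodes offer.

```python
class StandInNode:
    """Hypothetical stand-in, not an SDK class; it only mimics the access pattern."""

    def __init__(self, named_outputs):
        self._outputs = dict(named_outputs)

    def outputs(self):
        # All outputs of the node, keyed by output name.
        return self._outputs

    def output(self):
        # Shortcut that only makes sense when the node has exactly one output.
        if len(self._outputs) != 1:
            raise ValueError("output() requires exactly one output")
        return next(iter(self._outputs.values()))


n = StandInNode({"output_name": "placeholder NodeOutput"})
by_name = n.outputs()["output_name"]  # access by output name
direct = n.output()                   # same object, via the shortcut
```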
While we provide setters to modify specific parameters and inputs of nodes, we do not currently have individual getter methods. However, each node class comes with built-in methods to display its attributes, which are shown when the node is printed. Each node also has a construction_strs() method that may be called to return a list of the arguments that can replicate the node using its constructor.
Inputs and Outputs
InputNode
Represents the inputs (start points) to a pipeline. Your pipelines should always start with these.
Inputs:
None. This node represents what is passed into the pipeline when it is run.
Parameters:
name: A string representing the input name, e.g. "text_input". Should only contain alphanumeric characters and underscores.
input_type: A string representing the input type. Each input type corresponds with a specific data type for the outputs of the node. The string must be one of the following, and an error is thrown otherwise:
    "text": The input to the pipeline should be (one or more pieces of) text. Corresponds to the Text data type (List[Text] for multiple inputs).
    "file": The input to the pipeline should be one or more files. Corresponds to the File data type (List[File] for multiple inputs).
process_files: If input_type is "file", sets whether or not to automatically process the files into text. (If set to True, this node essentially also includes the functionality of FileLoaderNode.) Ignored if input_type is not "file".
Outputs:
value: The NodeOutput representing the pipeline's input. The output data type is specified by the input_type parameter above.
Note: This node returns a single output, so it can be accessed directly via the output() method.
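The constraints on name and input_type can be checked before constructing a node. The following is a minimal sketch; check_input_node_args is a hypothetical helper and not an SDK function, and the SDK's own error type and message may differ.

```python
import re

# Allowed values taken from the parameter list above.
VALID_INPUT_TYPES = {"text", "file"}
# Names may only contain alphanumeric characters and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def check_input_node_args(name, input_type):
    """Hypothetical pre-check mirroring the documented constraints."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid input name: {name!r}")
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"invalid input type: {input_type!r}")
    return name, input_type


checked = check_input_node_args("text_input", "text")
```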
Setters for the node parameters.
OutputNode
Represents the outputs (endpoints) to a pipeline. Your pipelines should always end with these.
Inputs:
input: The NodeOutput to be used as the pipeline output, whose data type should match input_type.
Parameters:
name: A string representing the name of the pipeline's overall output, e.g. "text_output". Should only contain alphanumeric characters and underscores.
input_type: A string representing the type of the pipeline output. Each type corresponds with a specific data type for the node's input. The string must be one of the following, and an error is thrown otherwise:
    "text": The output of the pipeline should be (one or more pieces of) text. Corresponds to the Text data type (List[Text] for multiple outputs).
    "formatted_text": The output of the pipeline should be (one or more pieces of) text. Same as text, but formatted as Markdown.
    "file": The output of the pipeline should be one or more files. Corresponds to the File data type (List[File] for multiple outputs).
    "image"
Outputs:
None. This node represents what the pipeline produces when it is run.
Setters for the node parameters and inputs.
Text and File Data
TextNode
Represents a block of text. The text may include text variables, which are placeholders for text produced earlier in the pipeline. They are supplied as additional inputs and are notated within the text block using double curly brackets {{}}. For instance, a text block containing {{response}} would expect one text variable, response. When the pipeline is run, the earlier output is substituted in place of {{response}} to create the actual text.
Inputs:
text_inputs: A map of text variable names to NodeOutputs expected to produce the text for the variables. Each NodeOutput should have data type Text. text_inputs may contain a superset of the variables in text. However, each text variable in text should be included as a key in text_inputs. When the pipeline is run, each NodeOutput's contents are interpreted as text and substituted into the variable's places. If text contains no text variables, this can be empty.
Parameters:
text: The string representing the text block, wrapping text variables with double curly brackets. The same variable can be used in more than one place.
format_text: A flag for whether or not to auto-format text.
Outputs:
output: The NodeOutput representing the text, of data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for the text and the format_text flag. If the new text in set_text contains text variables, they must already be in the text inputs of the node.
Methods to set and remove text inputs. Variables added via set_text_input do not necessarily have to be in the current text. However, the variables removed via remove_text_input cannot be in the text.
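The rules for text variables can be illustrated in plain Python. This is only a sketch of the substitution behavior described for this node, not the SDK's implementation: find_text_variables and render_text are hypothetical helpers, plain strings stand in for the contents of NodeOutputs, and variable names are assumed to consist of word characters.

```python
import re

# Matches {{name}}; assumes variable names are made of word characters.
VARIABLE_PATTERN = re.compile(r"\{\{(\w+)\}\}")


def find_text_variables(text):
    """Return the variable names used in text, in order of first appearance."""
    seen = []
    for name in VARIABLE_PATTERN.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


def render_text(text, text_inputs):
    """Substitute every {{variable}} in text with its value from text_inputs."""
    # Every variable in the text must be a key; extra keys are allowed.
    missing = [v for v in find_text_variables(text) if v not in text_inputs]
    if missing:
        raise KeyError(f"missing text inputs: {missing}")
    return VARIABLE_PATTERN.sub(lambda m: str(text_inputs[m.group(1)]), text)


template = "Summary: {{response}} (again: {{response}})"
rendered = render_text(template, {"response": "42 items", "unused": "ignored"})
# rendered == "Summary: 42 items (again: 42 items)"
```

The same variable may appear several times, and keys that the text never uses are ignored, which matches the superset rule above.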
FileNode
Represents one or more files in a pipeline. Files should already be stored within the VectorShift platform. An API call is made upon initialization to retrieve relevant file data, so an API key is required.
Inputs:
None. This node expects to retrieve files via an API call to the VectorShift platform.
Parameters:
file_names: A list of file names stored on the VectorShift platform to be loaded by this node.
process_files: Whether or not to automatically process the files into text. (If set to True, this node essentially also includes the functionality of FileLoaderNode.)
chunk_size, chunk_overlap: How files should be loaded if process_files is True. Resulting strings will be of length at most chunk_size and overlap by chunk_overlap.
api_key: The VectorShift API key to make calls to retrieve the file data.
Outputs:
files: The NodeOutput representing the files, of data type List[File] if process_files is set to False, and List[Document] otherwise.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for the node parameters.
StickyNoteNode
A sticky note with no functionality.
Inputs:
None.
Parameters:
text: The text in the sticky note.
Outputs:
None.
Setter for the sticky note text.
FileSaveNode
Represents the saving of one or more files to the VectorShift platform.
Inputs:
name_input: A NodeOutput representing the name under which the file should be saved. The output of a TextNode can be used if the desired file name is known and fixed. Should have data type String.
files_input: One or more NodeOutputs representing files to be saved. They should have output data type File.
Parameters:
None.
Outputs:
None. This node represents saving files.
Setters for the node inputs.
User-Created VectorShift Objects
PipelineNode
Represents a nested Pipeline, which will be run as a part of the overall Pipeline. When the node is executed, the pipeline it represents is executed with the supplied inputs, and that pipeline's output becomes the node's output. The Pipeline must already exist on the VectorShift platform, so that it can be referenced by its ID or name. If neither the ID nor the name is provided, the node represents a generic nested Pipeline whose details must be provided before it is run. If an ID or name is provided, an API call is made upon initialization to retrieve relevant Pipeline data, meaning an API key is required.
It is also possible to construct PipelineNodes from pipeline objects. See the method from_pipeline_obj below.
Inputs:
inputs: A map of input names to NodeOutputs, which depends on the specific Pipeline. In essence, the NodeOutputs passed in are interpreted as inputs to the Pipeline represented by the PipelineNode. They should match up with the expected input names of the pipeline. For instance, if the Pipeline has input names input_1 and input_2, then the dictionary should contain those strings as keys.
Parameters:
pipeline_id: The ID of the Pipeline being represented.
pipeline_name: The name of the Pipeline being represented. At least one of pipeline_id and pipeline_name should be provided. If both are provided, pipeline_id is used to search for the Pipeline. If both are omitted, a generic Pipeline node will be saved, and details must be provided before the pipeline including the node is run.
username: The username of the user owning the Pipeline.
org_name: The organization name of the user owning the Pipeline, if applicable.
batch_mode: A flag to set whether or not the pipeline can run batched inputs.
api_key: The VectorShift API key to make calls to retrieve the Pipeline data.
Outputs:
Outputs are determined from the pipeline represented. Since each pipeline returns one or more named outputs that are of either File or Text data type, the keys of the outputs dictionary are the named outputs of the pipeline, with the values given the appropriate data type.
A static method to construct a pipeline node from a pipeline object, so that the programmer does not have to save the pipeline object manually. The pipeline will automatically be saved to the VectorShift platform when the method is run.
Arguments:
pipeline_obj: The pipeline object to be represented by the node.
inputs: A map of expected pipeline input names to NodeOutputs. As above, the map keys should match the expected input names of the pipeline object.
api_key: The API key to be used when saving the pipeline to the VectorShift platform.
Setters for the node's parameters and inputs.
AgentNode
Represents an agent. The agent must already exist on the VectorShift platform, so that it can be referenced by its ID or name. An API call is made upon initialization to retrieve relevant agent data, meaning an API key is required.
It is also possible to construct AgentNodes from agent objects. See the method from_agent_obj below.
Inputs:
inputs: A map of input names to NodeOutputs, which depends on the specific agent. In essence, the NodeOutputs passed in are interpreted as inputs to the agent represented by the AgentNode. They should match up with the expected input names of the agent. For instance, if the agent has input names input_1 and input_2, then the dictionary should contain those strings as keys.
Parameters:
agent_id: The ID of the agent being represented.
agent_name: The name of the agent being represented. At least one of agent_id and agent_name should be provided. If both are provided, agent_id is used to search for the agent object.
username: The username of the user owning the agent.
org_name: The organization name of the user owning the agent, if applicable.
api_key: The VectorShift API key to make calls to retrieve the agent data.
Outputs:
Outputs are determined by the agent represented. Since each agent returns one or more named outputs that are of either File or Text data type, the keys of the outputs dictionary are the named outputs of the agent, with the values given the appropriate data type.
A static method to construct an agent node from an agent object, so that the programmer does not have to save the agent object manually. The agent will automatically be saved to the VectorShift platform when the method is run.
Arguments:
agent_obj: The agent object to be represented by the node.
inputs: A map of expected agent input names to NodeOutputs. As above, the map keys should match the expected input names of the agent object.
api_key: The API key to be used when saving the agent to the VectorShift platform.
Setters for the node's parameters and inputs. The node currently does not support changing the agent itself; to do this, a new replacement node should be created.
IntegrationNode
Represents a particular action taken from a VectorShift integration (e.g. the "save files" action from a Google Drive integration). The integration should already exist on the VectorShift platform so that it can be referenced by its name. If the integration ID or action is not specified, the node represents a generic, incomplete integration whose details must be provided before it is run. The particular actions available depend on the integration. Some actions may require additional arguments passed into the constructor.
If this node contains information about the specific integration to use, an API call is made when a pipeline containing this node is saved to retrieve relevant integration data, meaning an API key is required.
See below for a list of integrations, actions, and their corresponding expected inputs/outputs.
Inputs:
inputs: A map of input names to lists of NodeOutputs, which depends on the specific integration. (If there is only one NodeOutput, a singleton list should be used as the value.) The inputs should match the expected names and data types of the specific integration and function.
Parameters:
integration_type: A string denoting the type of integration.
integration_id: The ID of the integration being represented. If not provided, then the node represents a generic integration that needs to be set up before the pipeline is run.
action: The name of the specific action to be used with the integration. If not provided, the node represents a generic integration whose action should be specified later.
api_key: The API key to be used when retrieving integration data from the VectorShift platform.
Outputs:
Outputs are determined from the specific integration action. They are currently given data type Any.
Supported Integration Actions and Parameters:
Salesforce
    run_sql_query
        Inputs: sql_query, expected data type List[Text]
        Outputs: output, data type Any
Google Drive
    search_files
        Inputs: query, expected data type Text
        Outputs: output, data type Any
    read_files
        Inputs: None
        Outputs: output, data type Any
    save_files
        Inputs:
            name, expected data type Text
            files, expected data type List[File]
        Outputs: None
Gmail
    search_emails
        Inputs: query, expected data type Text
        Outputs: output, data type Any
    create_draft, send_email
        Inputs:
            subject, expected data type Text
            recipients, expected data type Text
            body, expected data type Text
        Outputs: None
    draft_reply, send_reply
        Inputs:
            recipients, expected data type Text
            body, expected data type Text
            email_id, expected data type Text
        Outputs: None
Google Sheets
    write_to_sheet
        To initialize this integration, the node constructor expects an additional string argument file_id, an additional string argument sheet_id, and a string list argument sheet_fields.
        Inputs: One NodeOutput for each field given in sheet_fields, expected data type Text
        Outputs: None
Google Docs
    search_docs
        Inputs: query, expected data type Text
        Outputs: output, data type Any
    write_to_doc
        Inputs:
            doc_name, expected data type Text
            text, expected data type Text
        Outputs: None
Google Calendar
    search_events
        Inputs: query, expected data type Text
        Outputs: output, data type Any
    new_event
        Inputs:
            calendar_name, expected data type Text
            description, expected data type Text
        Outputs: None
Notion
    write_to_database
        To initialize this integration, the node constructor expects an additional string argument database_id and a string list argument database_fields.
        Inputs: One NodeOutput for each field given in database_fields, expected data type Text
        Outputs: None
AirTable
    new_record
        To initialize this integration, the node constructor expects an additional string argument base_id, a string argument table_id, and a string list argument table_fields.
        Inputs: One NodeOutput for each field given in table_fields, expected data type Text
        Outputs: None
Hubspot
    search_contacts, search_companies, search_deals
        Inputs: query, expected data type Text
        Outputs: output, data type Any
SugarCRM
    get_records
        Inputs:
            module, expected data type Text
            filter, expected data type Text
        Outputs: output, data type Any
Linear
    create_issue
        Inputs:
            title, expected data type Text
            team_name, expected data type Text
            description, expected data type Text
        Outputs: None
    create_comment
        Inputs:
            issue_name, expected data type Text
            comment, expected data type Text
        Outputs: None
Slack
    send_message
        Inputs:
            channel_name, expected data type Text
            message, expected data type Text
        Outputs: None
    search_messages
        Inputs: query, expected data type Text
        Outputs: output, data type Any
Discord
    send_message
        Inputs:
            channel_name, expected data type Text
            message, expected data type Text
        Outputs: None
    search_messages
        Inputs: query, expected data type Text
        Outputs: output, data type Any
Copper
    search
        Inputs: query, expected data type Text
        Outputs: output, data type Any
    create_lead
        Inputs:
            name, expected data type Text
            email, expected data type Text
        Outputs: None
Setters for the node's parameters and inputs.
Specific Integration Nodes
Akin to an IntegrationNode with integration_type = 'Salesforce'.
Akin to an IntegrationNode with integration_type = 'Google Drive'.
Akin to an IntegrationNode with integration_type = 'Gmail'.
Akin to an IntegrationNode with integration_type = 'Google Sheets'.
Akin to an IntegrationNode with integration_type = 'Google Docs'.
Akin to an IntegrationNode with integration_type = 'Google Calendar'.
Akin to an IntegrationNode with integration_type = 'Notion'.
Akin to an IntegrationNode with integration_type = 'Airtable'.
Akin to an IntegrationNode with integration_type = 'HubSpot'.
Akin to an IntegrationNode with integration_type = 'SugarCRM'.
Akin to an IntegrationNode with integration_type = 'Linear'.
Akin to an IntegrationNode with integration_type = 'Slack'.
Akin to an IntegrationNode with integration_type = 'Discord'.
Akin to an IntegrationNode with integration_type = 'Copper'.
TransformationNode
Represents a user-created transformation. The transformation must already exist on the VectorShift platform, so that it can be referenced by its name. An API call is made upon initialization to retrieve relevant transformation data, meaning an API key is required.
Inputs:
inputs: A map of input names to strings of NodeOutputs, which depends on the specific transformation. The inputs should match the expected names and data types of the specific transformation. There are currently no checks on the input, so it is up to you to ensure that the NodeOutputs you provide to the transformation node are compatible with the transformation.
Parameters:
transformation_name: The name of the user-created transformation being represented. Must be provided.
api_key: The API key to be used when retrieving information about the transformation from the VectorShift platform.
Outputs:
Outputs are determined from the specific transformation. They are currently given data type Any.
Setters for the node's parameters and inputs. The node currently does not support changing the transformation itself; to do this, a new replacement node should be created.
Models
OpenAILLMNode
Represents an OpenAI LLM. These models take in two main inputs: one "system" input that describes the background or context for generating the text, and one input for the prompt itself. For instance, the system input could be used for telling the model that it is an assistant for a specific task, and the prompt input could be a task-related question.
Optionally, text variables can be inserted into the system and prompt in an analogous manner to TextNode.
Inputs:
system_input: The output corresponding to the system prompt. Should have data type Text. Can also be a string.
prompt_input: The output corresponding to the prompt. Should have data type Text. Can also be a string.
text_inputs: A map of text variable names to NodeOutputs expected to produce the text for the system and prompt, if they are strings containing text variables. Each NodeOutput should have data type Text. Each text variable in system_input and prompt_input, if they are strings, should be included as a key in text_inputs. When the pipeline is run, each NodeOutput's contents are interpreted as text and substituted into the variable's places.
Parameters:
model: The specific OpenAI model to use. We currently support the following models:
    gpt-3.5-turbo, supporting up to 4096 tokens
    gpt-3.5-turbo-instruct, supporting up to 4096 tokens
    gpt-3.5-turbo-16k, supporting up to 16384 tokens
    gpt-4, supporting up to 8192 tokens
    gpt-4-32k, supporting up to 32768 tokens
    gpt-4-turbo, supporting up to 128000 tokens
    gpt-4-turbo-preview, supporting up to 128000 tokens
max_tokens: The maximum number of tokens the model should generate. Note that the tokens in the provided system and prompt inputs count toward this number, which should not exceed the model-specific limit listed above.
temperature: The temperature used by the model for text generation. Higher temperatures generate more diverse but possibly irregular text.
top_p: Controls the probability threshold when top-p (nucleus) sampling is used. Rather than considering every possible next token, the model samples only from the smallest set of most probable tokens whose combined probability reaches p. Should be between 0 and 1.
stream_response: A flag setting whether to return the model output as a stream or as one response.
json_response: A flag setting whether or not to return the model output in JSON format.
personal_api_key: An optional parameter to provide if you have a personal OpenAI account and wish to use your API key.
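To make the effect of top_p concrete, here is a small sketch of standard nucleus (top-p) sampling over a toy next-token distribution. It assumes the conventional definition, in which sampling is restricted to the smallest set of most probable tokens whose cumulative probability reaches p; the hosted models may differ in detail. The function names and the probabilities are made up for illustration.

```python
import random


def top_p_candidates(probs, p):
    """Return the smallest set of most probable tokens whose total probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break
    return kept


def sample_top_p(probs, p, rng=random):
    """Sample one token from the nucleus, weighted by the original probabilities."""
    kept = top_p_candidates(probs, p)
    tokens = [token for token, _ in kept]
    weights = [weight for _, weight in kept]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy distribution over four candidate next tokens.
next_token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "emu": 0.05}
nucleus = [token for token, _ in top_p_candidates(next_token_probs, 0.75)]
# nucleus == ["cat", "dog"]
choice = sample_top_p(next_token_probs, 0.75, rng=random.Random(0))
```

Lower values of p shrink the candidate set and make the output more predictable; a value of 1 keeps every token.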
Outputs:
response: The generated text, with data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters (and removers) for model parameters and inputs. The function of setters for text inputs is analogous to those for TextNode.
PromptLLMNode
A general class for LLMs that takes in a single text prompt input (unlike OpenAILLMNode, which expects two inputs). Optionally, text variables can be inserted into the prompt input analogously to TextNode.
We categorize LLMs into families (e.g. by Anthropic, Meta, etc.), denoted by the llm_family parameter. Each family comes with different models. Each specific model has its own max_tokens limit. See below for a list of model families, offered models, and their corresponding token limits.
Inputs:
prompt_input: The output corresponding to the prompt. Should have data type Text. Can also be a string.
text_inputs: A map of text variable names to NodeOutputs expected to produce the text for the variables. Each NodeOutput should have data type Text. Each text variable in prompt_input, if it is a string, should be included as a key in text_inputs. When the pipeline is run, each NodeOutput's contents are interpreted as text and substituted into the variable's places. If prompt_input contains no text variables, this can be empty.
Parameters:
llm_family: The overall family of LLMs to use.
model: The specific model within the family of models to use.
max_tokens: The maximum number of tokens the model should generate. Note that the tokens in the provided prompt count toward this number, which should not exceed the model-specific limit listed below.
temperature: The temperature used by the model for text generation. Higher temperatures generate more diverse but possibly irregular text.
top_p: Controls the probability threshold when top-p (nucleus) sampling is used. Rather than considering every possible next token, the model samples only from the smallest set of most probable tokens whose combined probability reaches p. Should be between 0 and 1.
Outputs:
response: The generated text, with data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Supported LLMs
Models are grouped by llm_family; each model is listed with its max_tokens limit.
anthropic
    claude-v2: 100000
    claude-instant: 100000
    claude-v2.1: 200000
cohere
    command: 4000
    command-light: 4000
aws
    titan-text-express: 8000
    titan-text-lite: 8000
meta
    llama2-chat-13b: 4096
    llama2-chat-70b: 4096
    llama2-13b: 4096
    llama2-70b: 4096
open_source
    mistralai/Mistral-7B-v0.1: 4096
    mistralai/Mistral-7B-Instruct-v0.1: 4096
    mistralai/Mistral-7B-Instruct-v0.2: 32768
    mistralai/Mixtral-8x7B-Instruct-v0.1: 32768
    mistralai/Mixtral-8x7B-v0.1: 32768
google
    gemini-pro: 32760
    text-bison: 8192
    text-bison-32k: 32760
    text-unicorn: 32760
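The limits above can be kept in a lookup table to check max_tokens before building a node. This is a sketch only: the numbers are copied from the list above, while MAX_TOKENS and check_max_tokens are hypothetical names and not part of the SDK.

```python
# Token limits copied from the list of supported LLMs above.
MAX_TOKENS = {
    "anthropic": {"claude-v2": 100000, "claude-instant": 100000, "claude-v2.1": 200000},
    "cohere": {"command": 4000, "command-light": 4000},
    "aws": {"titan-text-express": 8000, "titan-text-lite": 8000},
    "meta": {
        "llama2-chat-13b": 4096,
        "llama2-chat-70b": 4096,
        "llama2-13b": 4096,
        "llama2-70b": 4096,
    },
    "open_source": {
        "mistralai/Mistral-7B-v0.1": 4096,
        "mistralai/Mistral-7B-Instruct-v0.1": 4096,
        "mistralai/Mistral-7B-Instruct-v0.2": 32768,
        "mistralai/Mixtral-8x7B-Instruct-v0.1": 32768,
        "mistralai/Mixtral-8x7B-v0.1": 32768,
    },
    "google": {
        "gemini-pro": 32760,
        "text-bison": 8192,
        "text-bison-32k": 32760,
        "text-unicorn": 32760,
    },
}


def check_max_tokens(llm_family, model, max_tokens):
    """Hypothetical pre-check: return the model's limit or raise ValueError."""
    try:
        limit = MAX_TOKENS[llm_family][model]
    except KeyError:
        raise ValueError(f"unsupported model: {llm_family}/{model}") from None
    if not 0 < max_tokens <= limit:
        raise ValueError(f"max_tokens must be between 1 and {limit}")
    return limit


limit = check_max_tokens("anthropic", "claude-v2.1", 1024)
```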
Setters (and removers) for model parameters and inputs. The function of setters for text inputs is analogous to those for TextNode.
Specific Prompt LLM Nodes
Represents an Anthropic LLM. Akin to a PromptLLMNode with llm_family = 'anthropic'.
Represents a Cohere LLM. Akin to a PromptLLMNode with llm_family = 'cohere'.
Represents an AWS (Amazon) LLM. Akin to a PromptLLMNode with llm_family = 'aws'.
Represents a Meta LLM. Akin to a PromptLLMNode with llm_family = 'meta'.
Represents an open-source LLM. Akin to a PromptLLMNode with llm_family = 'open_source'.
Represents a Google LLM. Akin to a PromptLLMNode with llm_family = 'google'.
ImageGenNode
Represents a text-to-image generative model.
Inputs:
prompt_input: The text prompt for generating the image(s). Should have data type Text.
Parameters:
model: The specific text-to-image model used. We currently support the following models:
    DALL-E 2: supported image sizes 256, 512, and 1024; can generate 1-5 images
    Stable Diffusion XL: supported image size 512; can generate 1 image
    DALL-E 3: supported image sizes 1024, (1024, 1792), and (1792, 1024); can generate 1 image
image_size: The size of the image (e.g. if this is set to 512, then 512 x 512 images will be generated; if set to a tuple (a, b), then a x b images will be generated). Must be one of the valid sizes for the model as listed above.
num_images: The number of images to generate. Must be one of the valid numbers for the model as listed above.
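The per-model constraints can be expressed as a small table and checked up front. The values below come from the list above; IMAGE_MODELS and resolve_image_request are hypothetical names used only for this sketch, and the SDK may validate differently.

```python
# Valid sizes and image counts per model, copied from the list above.
IMAGE_MODELS = {
    "DALL-E 2": {"sizes": {256, 512, 1024}, "max_images": 5},
    "Stable Diffusion XL": {"sizes": {512}, "max_images": 1},
    "DALL-E 3": {"sizes": {1024, (1024, 1792), (1792, 1024)}, "max_images": 1},
}


def resolve_image_request(model, image_size, num_images):
    """Hypothetical pre-check: return (width, height) or raise ValueError."""
    spec = IMAGE_MODELS.get(model)
    if spec is None:
        raise ValueError(f"unsupported model: {model!r}")
    if isinstance(image_size, list):
        image_size = tuple(image_size)  # lists are unhashable; compare as tuples
    if image_size not in spec["sizes"]:
        raise ValueError(f"unsupported image size for {model}: {image_size!r}")
    if not 1 <= num_images <= spec["max_images"]:
        raise ValueError(f"{model} can generate 1 to {spec['max_images']} image(s)")
    # An integer n means an n x n image; a tuple (a, b) means an a x b image.
    if isinstance(image_size, int):
        return image_size, image_size
    return image_size


square = resolve_image_request("DALL-E 2", 512, 3)           # (512, 512)
portrait = resolve_image_request("DALL-E 3", (1024, 1792), 1)  # (1024, 1792)
```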
Outputs:
images: The generated image(s), with data type List[ImageFile].
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters (and removers) for model parameters and inputs. The function of setters for text inputs is analogous to those for TextNode.
SpeechToTextNode
Represents a speech-to-text generative model.
Inputs:
audio_input: The audio file to be converted to text. Should have data type AudioFile.
Parameters:
model: The specific speech-to-text model to use. We currently only support the model OpenAI Whisper.
Outputs:
output: The transcribed text, with data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for model parameters and inputs.
ImageToTextNode
Represents an (OpenAI) image-to-text LLM. These models take in three main inputs: a system and prompt input analogous to OpenAILLMNode, and an image input. Text variables can be optionally inserted.
Inputs:
system_input: The output corresponding to the system prompt. Should have data type Text. Can also be a string.
prompt_input: The output corresponding to the prompt. Should have data type Text. Can also be a string.
text_inputs: A map of text variable names to NodeOutputs expected to produce the text for the system and prompt, if they are strings containing text variables. Each NodeOutput should have data type Text. Each text variable in system_input and prompt_input, if they are strings, should be included as a key in text_inputs. When the pipeline is run, each NodeOutput's contents are interpreted as text and substituted into the variable's places.
image_input: The output corresponding to the image. Should have data type ImageFile.
Parameters:
model: The specific OpenAI model to use. We currently only support the model gpt-4-vision-preview.
max_tokens: The maximum number of tokens the model should generate. Note that the tokens in the provided system and prompt inputs count toward this number. Should be no larger than 4096.
temperature: The temperature used by the model for text generation. Higher temperatures generate more diverse but possibly irregular text.
top_p: Controls the probability threshold when top-p (nucleus) sampling is used. Rather than considering every possible next token, the model samples only from the smallest set of most probable tokens whose combined probability reaches p. Should be between 0 and 1.
stream_response: A flag setting whether to return the model output as a stream or as one response.
json_response: A flag setting whether or not to return the model output in JSON format.
personal_api_key: An optional parameter to provide if you have a personal OpenAI account and wish to use your API key.
Outputs:
response: The generated text, with data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters (and removers) for model parameters and inputs. The function of setters for text inputs is analogous to those for TextNode.
Data Loaders
DataLoaderNode
A general-purpose node representing the retrieval of data from a third-party source. The names and data types of inputs and outputs are dependent on the specific loader (the loader type). Inputs can either be string parameters or NodeOutputs from earlier nodes. See below for a list of loader types and their corresponding inputs/outputs.
For most data loaders, the output is a list of documents. The optional parameters chunk_size and chunk_overlap then determine how those documents are formed, specifying the size and overlap in tokens of each document. For instance, if the total data size is 1000 tokens, a size of 500 and an overlap of 0 will give 2 documents (tokens 1-500, 501-1000), while an overlap of 250 gives 3 (tokens 1-500, 251-750, 501-1000).
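The chunking arithmetic can be reproduced with a short function. This is a sketch of the windowing logic only, using 1-indexed inclusive token ranges; chunk_ranges is a hypothetical helper, and the actual loaders may split on other boundaries.

```python
def chunk_ranges(total_tokens, chunk_size, chunk_overlap):
    """Return 1-indexed inclusive (start, end) token ranges for each document."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if total_tokens <= 0:
        return []
    stride = chunk_size - chunk_overlap  # how far each new document advances
    ranges = []
    start = 1
    while True:
        end = min(start + chunk_size - 1, total_tokens)
        ranges.append((start, end))
        if end >= total_tokens:
            break
        start += stride
    return ranges


no_overlap = chunk_ranges(1000, 500, 0)
# no_overlap == [(1, 500), (501, 1000)]
with_overlap = chunk_ranges(1000, 500, 250)
# with_overlap == [(1, 500), (251, 750), (501, 1000)]
```

Each new document starts chunk_size minus chunk_overlap tokens after the previous one, so a larger overlap yields more documents for the same data.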
Inputs:
inputs: A map of input names to lists of either strings or NodeOutputs, which depends on the specific loader. (If the input field is known, a string can directly be supplied.)
Parameters:
loader_type: The specific data loader. Should be one of the valid data loader types listed below.
chunk_size: The maximum size of each document in tokens, if the node returns a List[Document].
chunk_overlap: The amount of overlap between documents in tokens, if the node returns a List[Document].
Outputs:
output: The data loaded by the node. The data type depends on the specific loader.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Supported Data Loader Types
File: Load the contents of one or more files.
    Inputs:
        file: A list of NodeOutputs, which should each have data type File, List[File], Text, or List[Text].
    Output: List[Document]
CSV Query: Query a CSV and return the results.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
        csv: A singleton list of a NodeOutput with data type CSVFile.
    Output: Text
URL: Retrieve the contents of a website from its URL.
    Inputs:
        url: A singleton list of a string or NodeOutput with data type URL.
    Output: List[Document]
Wikipedia: Query Wikipedia and return the results.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
YouTube: Retrieve the transcribed contents of the audio of one or more YouTube videos from their URL(s).
    Inputs:
        url: A singleton list of a string or NodeOutput with data type URL or List[URL].
    Output: List[Document]
Arxiv: Query ArXiv and return the results.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
SerpAPI: Use SerpApi to find and return online search results for a given query, given an API key.
    Inputs:
        apiKey: A singleton list of a string or NodeOutput with data type Text.
        query: A singleton list of a NodeOutput with data type Text.
    Output: Text
Git: Return the contents of a Git repository from its URL.
    Inputs:
        repo: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
YOU_DOT_COM: Use You.com to search the internet.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
YOU_DOT_COM_NEWS: Use You.com to search for news articles.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
EXA_AI_SEARCH: Use Exa AI to search the internet.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
EXA_AI_SEARCH_COMPANIES: Use Exa AI to search for company information.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
EXA_AI_SEARCH_RESEARCH_PAPERS: Use Exa AI to search for research papers.
    Inputs:
        query: A singleton list of a string or NodeOutput with data type Text.
    Output: List[Document]
Setters for node parameters and inputs.
Specific Data Loader Nodes
Akin to a DataLoaderNode with loader_type = 'File' and inputs being {'file': files_input}.
Akin to a DataLoaderNode with loader_type = 'CSV Query' and inputs being {'query': [query_input], 'csv': [csv_input]}.
Akin to a DataLoaderNode with loader_type = 'URL' and inputs being {'url': [url_input]}.
Akin to a DataLoaderNode with loader_type = 'Wikipedia' and inputs being {'query': [query_input]}.
Akin to a DataLoaderNode with loader_type = 'YouTube' and inputs being {'url': [url_input]}.
Akin to a DataLoaderNode with loader_type = 'Arxiv' and inputs being {'query': [query_input]}.
Akin to a DataLoaderNode with loader_type = 'SerpAPI' and inputs being {'apiKey': [api_key_input], 'query': [query_input]}.
Akin to a DataLoaderNode with loader_type = 'Git' and inputs being {'repo': [repo_input]}.
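The input dictionaries above follow a fixed shape per loader type. The sketch below checks that shape in plain Python before any node is built. The table of input names is copied from the supported loader types listed earlier; the helper check_loader_inputs itself is hypothetical and not part of the SDK, and plain strings stand in for NodeOutputs.

```python
from typing import Any, Dict, List

# Input names per loader type, taken from the list of supported loader types.
REQUIRED_INPUTS: Dict[str, List[str]] = {
    "File": ["file"],
    "CSV Query": ["query", "csv"],
    "URL": ["url"],
    "Wikipedia": ["query"],
    "YouTube": ["url"],
    "Arxiv": ["query"],
    "SerpAPI": ["apiKey", "query"],
    "Git": ["repo"],
    "YOU_DOT_COM": ["query"],
    "YOU_DOT_COM_NEWS": ["query"],
    "EXA_AI_SEARCH": ["query"],
    "EXA_AI_SEARCH_COMPANIES": ["query"],
    "EXA_AI_SEARCH_RESEARCH_PAPERS": ["query"],
}


def check_loader_inputs(loader_type: str, inputs: Dict[str, Any]) -> None:
    """Raise if inputs does not match the documented shape (hypothetical helper)."""
    if loader_type not in REQUIRED_INPUTS:
        raise ValueError(f"unknown loader_type: {loader_type!r}")
    expected = set(REQUIRED_INPUTS[loader_type])
    if set(inputs) != expected:
        raise ValueError(f"expected inputs {sorted(expected)}, got {sorted(inputs)}")
    for name, value in inputs.items():
        if not isinstance(value, list):
            raise TypeError(f"input {name!r} must be a list")
        # Every loader except 'File' documents its inputs as singleton lists.
        if loader_type != "File" and len(value) != 1:
            raise ValueError(f"input {name!r} must be a singleton list")


check_loader_inputs("CSV Query", {"query": ["total sales by region"], "csv": ["csv_output"]})
check_loader_inputs("File", {"file": ["file_a", "file_b"]})
```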
ApiLoaderNode
ApiLoaderNode
A node that executes an API call and returns its results. Constructor inputs essentially define the parameters of the API call and should all be strings.
Inputs:
None.
Parameters:
method: The API method. Should be one of 'GET', 'POST', 'PUT', 'DELETE', or 'PATCH'.
url: The API endpoint to call.
headers: A list of tuples of strings, representing the headers as key-value pairs.
param_type: The type of API parameters, either 'Body' or 'Query'.
params: A list of tuples of strings, representing the parameters as key-value pairs.
Outputs:
output: The data returned from the API call, of data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for node attributes.
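To make the roles of headers, param_type, and params concrete, the sketch below assembles a request description from the same kinds of values using only the standard library. It does not send anything and is not how the SDK executes the call: the helper build_request, the JSON encoding of 'Body' parameters, and the example URL are assumptions for illustration.

```python
import json
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlencode

ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE", "PATCH"}
ALLOWED_PARAM_TYPES = {"Body", "Query"}


def build_request(
    method: str,
    url: str,
    headers: List[Tuple[str, str]],
    param_type: str,
    params: List[Tuple[str, str]],
) -> Dict[str, Optional[object]]:
    """Describe the HTTP request implied by the arguments (hypothetical helper)."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    if param_type not in ALLOWED_PARAM_TYPES:
        raise ValueError(f"unsupported param_type: {param_type!r}")

    request = {"method": method, "url": url, "headers": dict(headers), "body": None}
    if param_type == "Query":
        # Query parameters are appended to the URL.
        if params:
            request["url"] = f"{url}?{urlencode(params)}"
    else:
        # Assumption: body parameters are sent as a JSON object.
        request["body"] = json.dumps(dict(params))
    return request


req = build_request(
    "GET",
    "https://example.com/search",
    headers=[("Accept", "application/json")],
    param_type="Query",
    params=[("q", "cats"), ("limit", "5")],
)
print(req["url"])
```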
Search and Knowledge Bases
KnowledgeBaseNode
KnowledgeBaseNode
References a particular permanent Knowledge Base, queries it, and returns the results. The Knowledge Base should already exist on the VectorShift platform, so that it can be referenced by its ID or name. If neither the ID nor the name is provided, the node represents a generic Knowledge Base whose details need to be provided before it is run. If the ID is provided, an API call is made when a pipeline containing this node is saved to retrieve relevant data, meaning an API key is required.
Knowledge Bases are representations of Vector Stores within pipelines; this class is synonymous with VectorStoreNode. However, the VectorStoreNode name is deprecated.
It is also possible to construct KnowledgeBaseNodes from Vector Store objects. See the method from_obj below.
Inputs:
query_input: The query to the Knowledge Base, which should have data type Text.
Parameters:
base_id: The ID of the Knowledge Base being represented.
base_name: The name of the Knowledge Base being represented. At least one of base_id and base_name should be provided. If both are provided, base_id is used to search for the Knowledge Base object. If both are omitted, the node represents a generic Knowledge Base object which must be set up before being run.
username: The username of the user owning the Knowledge Base.
org_name: The organization name of the user owning the Knowledge Base, if applicable.
max_docs_per_query: The maximum number of documents from the Knowledge Base to return from the query.
enable_filter: Flag for whether or not to add a filter to query results.
filter_input: The additional filter for results if enable_filter is True. Should be a NodeOutput of type Text or a string.
rerank_documents: Flag for whether or not to rerank documents.
alpha: The value of alpha for performing searches (weighting between dense and sparse indices). Ignored if the Knowledge Base is not hybrid.
api_key: The API key to be used when retrieving information about the Knowledge Base from the VectorShift platform.
Outputs:
results: The documents returned from the Knowledge Base, with data type List[Document].
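The precedence between base_id and base_name described above can be summarized in a few lines. This is only a sketch of the documented rule; the function resolve_knowledge_base and its return values are hypothetical and do not correspond to an SDK call.

```python
from typing import Optional, Tuple


def resolve_knowledge_base(
    base_id: Optional[str] = None, base_name: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Return how the Knowledge Base would be looked up (hypothetical helper)."""
    if base_id is not None:
        # base_id takes precedence when both identifiers are given.
        return ("id", base_id)
    if base_name is not None:
        return ("name", base_name)
    # Neither identifier: a generic Knowledge Base to be set up before running.
    return ("generic", None)


print(resolve_knowledge_base(base_id="kb_123", base_name="Support Docs"))
print(resolve_knowledge_base(base_name="Support Docs"))
print(resolve_knowledge_base())
```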
A static method to construct a Knowledge Base node from a Vector Store object. The Knowledge Base will automatically be saved to the VectorShift platform when the method is run.
Arguments:
obj: The Knowledge Base to be represented by the node.
query_input: The query to the Knowledge Base, which should have data type Text.
api_key: The API key to be used when saving the Knowledge Base to the VectorShift platform.
Setters for node parameters and inputs.
VectorDBLoaderNode
VectorDBLoaderNode
Load text documents into a new temporary vector database that can be later queried. Once the pipeline finishes running, the database is deleted.
Note: Deprecated in favor of SemanticSearchNode.
Inputs:
documents_input: A list of one or more NodeOutputs to be loaded into the vector database. Each NodeOutput should have data type Text.
Parameters:
None.
Outputs:
database: The resulting vector database, with data type VectorDB.
Note: This node returns a single output, so it can be accessed directly via the output() method.
VectorDBReaderNode
VectorDBReaderNode
Query a temporary vector database and return its results, similar to a data loader node.
Note: Deprecated in favor of SemanticSearchNode.
Inputs:
query_input: The query to the vector database, which should have data type Text.
database_input: The vector database to query, which should have data type VectorDB.
max_docs_per_query: The maximum number of documents from the vector database to return from the query.
Parameters:
None.
Outputs:
results: The documents returned from the vector database, with data type List[Document].
Note: This node returns a single output, so it can be accessed directly via the output() method.
SemanticSearchNode
SemanticSearchNode
Create a new temporary vector database, load documents into it, run one or more queries, and return the results. Once the pipeline finishes running, the database is deleted. Akin to chaining together a VectorDBLoaderNode and VectorDBReaderNode.
This class is synonymous with VectorQueryNode. However, the VectorQueryNode name is deprecated.
Inputs:
query_input: The query/queries to the vector database. Each NodeOutput should have data type Text.
documents_input: A list of one or more NodeOutputs to be loaded into the vector database. Each NodeOutput should have data type Text.
max_docs_per_query: The maximum number of documents from the vector database to return from the query.
enable_filter: Flag for whether or not to add an additional filter to query results.
filter_input: The additional filter for results if enable_filter is True. Should be a NodeOutput of type Text or a string.
rerank_documents: Flag for whether or not to rerank documents.
Outputs:
result: The documents returned from the vector database, with data type List[Document].
Setters for node parameters and inputs.
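The load-then-query behaviour can be pictured with a toy in-memory search. The sketch below is a deliberately simplified stand-in: it uses bag-of-words vectors and cosine similarity rather than a real embedding model, and none of the function names come from the SDK. It only illustrates the idea of building a temporary index, ranking documents against a query, and keeping at most max_docs_per_query results.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real vector database would use a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(query: str, documents: List[str], max_docs_per_query: int = 2) -> List[str]:
    # The index only lives for the duration of this call, like the temporary database.
    index = [(doc, embed(doc)) for doc in documents]
    query_vec = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(query_vec, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:max_docs_per_query]]


docs = ["cats purr", "dogs bark loudly", "stock prices fell"]
print(semantic_search("why do dogs bark", docs, max_docs_per_query=1))
```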
Logic
LogicConditionNode
LogicConditionNode
This node allows for simple control flow. It takes in one or more inputs, which are given labels (akin to variable names). It also takes in a list of conditions. Each condition is a tuple of two strings: a predicate that can reference the labels, and the label of the NodeOutput to be emitted by the node if the predicate is True. The predicate must be a string representing a boolean expression in Python; more information on predicates is located in the documentation on conditional logic.
The node has multiple outputs: one output corresponding to each of the conditions, along with an else output. Predicates are evaluated in order. If a predicate evaluates to True, that condition's output emits the NodeOutput whose label is paired with the predicate, and further predicates are not evaluated (i.e. the node only activates the first path that evaluates to True). Outputs for conditions that are not activated are not produced, and downstream nodes from those outputs will not be executed. The outputs are labeled output-0, output-1, etc. for each of the conditions, and output-else.
For example, consider a condition node cond_node with outputs output-0, output-1, and output-else, which forward the outputs of text1, text2, and text3 respectively. output-0 is only emitted if the input is "hello", and output-1 is only emitted if the input is "goodbye".
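The first-match behaviour can be simulated without the SDK. In the sketch below, plain strings stand in for NodeOutputs, predicates are evaluated with Python's eval against the labelled inputs, and the function evaluate_conditions is a hypothetical helper written for this illustration. It mirrors the hello/goodbye scenario using the labels input, text1, text2, and text3, which are also assumed names.

```python
from typing import Any, Dict, List, Tuple


def evaluate_conditions(
    inputs: Dict[str, Any],
    conditions: List[Tuple[str, str]],
    else_value: str,
) -> Dict[str, Any]:
    """Return the single output that would be emitted (simulation only)."""
    for i, (predicate, label) in enumerate(conditions):
        # Predicates may reference input labels as if they were variables.
        if eval(predicate, {"__builtins__": {}}, dict(inputs)):
            return {f"output-{i}": inputs[label]}  # first True predicate wins
    return {"output-else": inputs[else_value]}


conditions = [
    ("input == 'hello'", "text1"),
    ("input == 'goodbye'", "text2"),
]
labelled = {"input": "goodbye", "text1": "Hi there", "text2": "See you", "text3": "Pardon?"}
print(evaluate_conditions(labelled, conditions, else_value="text3"))
```

Only one key appears in the result, which reflects that the other outputs are not produced and their downstream nodes would not run.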
Inputs:
inputs: A map of output labels to NodeOutputs. Identifies each NodeOutput with a label. Can have any data type.
Parameters:
conditions: A list of conditions. As explained above, each condition consists of a predicate, which should be a string expressing a Python boolean statement, and an output label. The predicates are evaluated in order of the list. The first predicate that evaluates to True will return the NodeOutput identified by the associated label. If no predicates evaluate to True, the NodeOutput identified by else_value is returned.
else_value: The label of the NodeOutput to emit in the else case.
Outputs:
Outputs named output-0, output-1, ..., output-n, where n is one less than the total number of conditions. output-i equals the NodeOutput identified by the label in the ith (0-indexed) condition in the list, and is only produced if the ith predicate evaluates to True. The data type is the same as the original NodeOutput's data type.
An output named output-else, which emits the NodeOutput whose label is given by else_value. The data type is the same as the original NodeOutput's data type.
Setters for node parameters and inputs.
A method to get the NodeOutput corresponding to the ith (0-indexed) condition, i.e. outputs()['output-i'].
A method to get the NodeOutput corresponding to the else condition, i.e. outputs()['output-else'].
LogicMergeNode
LogicMergeNode
This node merges together conditional branches that may have been produced by a LogicConditionNode, returning the output that is the first in the list to have been computed. As above, the documentation on conditional logic may provide helpful context.
Inputs:
inputs: Different outputs from conditional branches to combine.
Parameters:
None.
Outputs:
output: The merged output, of data type Union[ts], where ts represents the data types of all input NodeOutputs.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setter for the node inputs.
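One way to picture the merge is as "take the first branch that actually produced a value". The sketch below models branches that did not run with a sentinel object. Both the sentinel NOT_PRODUCED and the function merge_branches are assumptions for illustration, and the interpretation of "first in the list" as list order is also an assumption.

```python
from typing import Any, List

# Sentinel marking a conditional branch that was never executed (hypothetical).
NOT_PRODUCED = object()


def merge_branches(branches: List[Any]) -> Any:
    """Return the first branch value in list order that was produced."""
    for value in branches:
        if value is not NOT_PRODUCED:
            return value
    return NOT_PRODUCED


# Only the second branch ran, so its value is the merged output.
print(merge_branches([NOT_PRODUCED, "See you", NOT_PRODUCED]))
```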
SplitTextNode
SplitTextNode
Splits text into multiple strings based on a delimiter.
Inputs:
text_input: An output containing text to split, which should have data type Text.
Parameters:
delimiter: The string on which to split the text. If the text is 'foo, bar, baz' and delimiter is ',', then the result corresponds to the strings 'foo', ' bar', and ' baz'.
Outputs:
output: All the split strings, of data type List[Text].
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for the node parameters and inputs.
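The documented example matches the behaviour of Python's built-in str.split, which keeps surrounding whitespace intact. The sketch below reproduces it; the wrapper split_text is a hypothetical name, and whether the node handles edge cases (such as an empty delimiter) the same way as str.split is not stated in this documentation.

```python
from typing import List


def split_text(text: str, delimiter: str) -> List[str]:
    """Split text on delimiter without trimming whitespace (illustrative helper)."""
    return text.split(delimiter)


parts = split_text("foo, bar, baz", ",")
print(parts)  # ['foo', ' bar', ' baz']
```

If the leading spaces are unwanted, a delimiter of ', ' (comma followed by a space) would split the same text into 'foo', 'bar', and 'baz'.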
TimeNode
TimeNode
Outputs a time in text form given a time zone and optional offset.
Inputs:
None.
Parameters:
timezone: The time zone, which should be a time zone name available in pytz.
delta: The value of a time offset.
delta_unit: The units of a time offset. Should be one of 'seconds', 'minutes', 'hours', 'days', or 'weeks'.
output_format: The string format in which to output the time. Should be one of 'Timestamp', 'DD/MM/YYYY', or 'DD-MM-YYYY / HH:MM:SS'.
Outputs:
output: A string representing the time.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for the node attributes.
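The delta_unit values happen to match the keyword arguments of datetime.timedelta, which makes the offset easy to sketch with the standard library. The example below takes an explicit base time so the result is reproducible. The helper format_time, the strftime patterns chosen for each output_format, and the treatment of 'Timestamp' as a POSIX timestamp are all assumptions, not the node's documented behaviour.

```python
from datetime import datetime, timedelta, timezone

DELTA_UNITS = {"seconds", "minutes", "hours", "days", "weeks"}

# Assumed mapping from the documented format names to strftime patterns.
STRFTIME_FORMATS = {
    "DD/MM/YYYY": "%d/%m/%Y",
    "DD-MM-YYYY / HH:MM:SS": "%d-%m-%Y / %H:%M:%S",
}


def format_time(base: datetime, delta: int, delta_unit: str, output_format: str) -> str:
    """Apply an offset to base and render it as text (hypothetical helper)."""
    if delta_unit not in DELTA_UNITS:
        raise ValueError(f"unsupported delta_unit: {delta_unit!r}")
    shifted = base + timedelta(**{delta_unit: delta})
    if output_format == "Timestamp":
        return str(shifted.timestamp())  # assumption: POSIX timestamp as text
    if output_format not in STRFTIME_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return shifted.strftime(STRFTIME_FORMATS[output_format])


base = datetime(2024, 1, 31, 23, 30, 0, tzinfo=timezone.utc)
print(format_time(base, 1, "hours", "DD-MM-YYYY / HH:MM:SS"))
```

Here a one-hour offset applied at 23:30 on 31 January rolls the date over to 1 February.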
Chat
ChatMemoryNode
ChatMemoryNode
Represents the chat memory for chatbots, i.e. the chat history that the chatbot can reference when generating messages.
Inputs:
None.
Parameters:
memory_type: The particular type of chat memory to use. Options:
'Full - Formatted': The full chat history with formatting to indicate different messages. If this is selected, memory_window should not be provided.
'Full - Raw': The full chat history without formatting. If this is selected, memory_window should not be provided.
'Message Buffer': The last memory_window messages. If memory_window is not specified, defaults to 10.
'Token Buffer': The last memory_window tokens. If memory_window is not specified, defaults to 2048.
Outputs:
value: The chat history, of data type Text if memory_type is 'Full - Formatted' and List[Dict] otherwise.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for node parameters.
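The interaction between memory_type and memory_window can be summarized as a small lookup. The defaults below (10 messages, 2048 tokens) come from the option descriptions above; the helper resolve_memory_window and its choice to raise an error when a window is supplied for a full-history type are assumptions for this sketch.

```python
from typing import Optional

# Defaults taken from the documented options.
DEFAULT_WINDOWS = {"Message Buffer": 10, "Token Buffer": 2048}
FULL_TYPES = {"Full - Formatted", "Full - Raw"}


def resolve_memory_window(memory_type: str, memory_window: Optional[int] = None) -> Optional[int]:
    """Return the effective window for a memory type (hypothetical helper)."""
    if memory_type in FULL_TYPES:
        if memory_window is not None:
            raise ValueError(f"memory_window should not be provided for {memory_type!r}")
        return None  # the full history is used, so no window applies
    if memory_type in DEFAULT_WINDOWS:
        return DEFAULT_WINDOWS[memory_type] if memory_window is None else memory_window
    raise ValueError(f"unknown memory_type: {memory_type!r}")


print(resolve_memory_window("Message Buffer"))
print(resolve_memory_window("Token Buffer", 512))
```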
DataCollectorNode
DataCollectorNode
Prompts an LLM to search the chat history based on a prompt and one or more example fields, returning a summary of the relevant information.
Inputs:
input: A NodeOutput which should represent the chat memory. Should be of data type Text or List[Dict] (coming from a ChatMemoryNode).
Parameters:
prompt: The string prompt to guide the kind of data from the chat history to collect.
fields: A list of dictionaries, each indicating a data field to collect. Each dictionary should contain the following fields: field, containing the name of the field; description, describing the field; and example, giving a text example of what to search for.
Outputs:
output: A selection of relevant information from the chat history, of data type Text.
Note: This node returns a single output, so it can be accessed directly via the output() method.
Setters for node parameters. Arguments for fields should follow the structure as described above.
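A fields argument following the structure above might look like the list below. The validation helper validate_fields is hypothetical and only checks that each dictionary carries the three documented keys with string values; the example field names and descriptions are made up for illustration.

```python
from typing import Dict, List

REQUIRED_KEYS = ("field", "description", "example")


def validate_fields(fields: List[Dict[str, str]]) -> List[str]:
    """Check the documented structure and return the field names (hypothetical helper)."""
    names = []
    for i, entry in enumerate(fields):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"fields[{i}] is missing keys: {missing}")
        if not all(isinstance(entry[key], str) for key in REQUIRED_KEYS):
            raise TypeError(f"fields[{i}] values must be strings")
        names.append(entry["field"])
    return names


fields = [
    {"field": "email", "description": "The user's email address", "example": "jane@example.com"},
    {"field": "order_id", "description": "The order being discussed", "example": "A-10492"},
]
print(validate_fields(fields))
```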