I/O nodes - VectorShift

Add these nodes with the pipeline builder: pipeline.add(name="...").<node>(...). Each entry lists the node’s configuration parameters. See the Pipeline reference for add, run, and lifecycle methods.

`append_files`

Append files to one another in sequence.

pipeline.add(name="node").append_files()

Parameters

file_type

str

default:"'PDF'"

One of: PDF, PPTX

selected_files

AcceptsFileList

default:"[]"

`file`

Load a static file into the workflow as a raw File or process it into Text.

pipeline.add(name="node").file(file_name="...", file=...)

Parameters

selected_option

str

default:"'upload'"

Select an existing file from the VectorShift platform. One of: name, upload

file_name

str

required

The name of the file on the VectorShift platform (for files under the ‘Files’ tab).

file_parser

str

default:"'default'"

The model used to process the document. The default model provides standard document parsing and OCR. LlamaParse can read documents with complex features (e.g., tables and charts) and is charged at 0.3 cents per page. Textract provides the most advanced data extraction and is charged at 1.5 cents per page. One of: contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract

file

AcceptsFile

required

The file that was passed in
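The per-page prices quoted for file_parser lend themselves to a quick estimate. The sketch below is plain Python and does not call VectorShift; the helper name estimate_parse_cost_usd and the price table are illustrative assumptions built only from the two prices stated above. Parsers with no stated price are left out rather than guessed.

```python
# Per-page prices in US cents, copied from the file_parser description.
# Only llama_parse and textract have a documented price; other parsers
# are intentionally absent (assumption: their pricing is not stated here).
PRICE_CENTS_PER_PAGE = {
    "llama_parse": 0.3,
    "textract": 1.5,
}


def estimate_parse_cost_usd(file_parser: str, pages: int) -> float:
    """Estimate the parsing charge in US dollars (hypothetical helper)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if file_parser not in PRICE_CENTS_PER_PAGE:
        raise KeyError(f"no per-page price documented for {file_parser!r}")
    # cents per page * pages -> cents, then convert to dollars
    return round(PRICE_CENTS_PER_PAGE[file_parser] * pages / 100, 4)


if __name__ == "__main__":
    print(estimate_parse_cost_usd("llama_parse", 200))  # 0.6
    print(estimate_parse_cost_usd("textract", 200))     # 3.0
```

At these rates a 200-page document costs five times as much with textract as with llama_parse, which may matter when choosing a parser for bulk uploads.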

`file_operations`

Process and manipulate files

pipeline.add(name="node").file_operations()

Parameters

sub_type

str

default:"''"

`file_save`

Save a file on the VectorShift platform (under the ‘Files’ tab).

pipeline.add(name="node").file_save(files=..., name="...")

Parameters

files

AcceptsFileList

required

name

str

required

`file_to_text`

Convert data from type File to type Text

pipeline.add(name="node").file_to_text(file=...)

Parameters

chunk_text

bool

default:"False"

Whether to chunk the text into smaller pieces.

decrypt

bool

default:"False"

Use a password to decrypt the file.

file

AcceptsFile

required

The file to convert to text.

file_parser

str

default:"'default'"

The type of file parser to use. One of: contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract

password

str

default:"''"

The password to decrypt the file.

chunk_overlap

int

default:"400"

The overlap of each chunk of text.

chunk_size

int

default:"1024"

The size of each chunk of text.
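To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal sliding-window sketch in plain Python. It is not the platform's implementation: the reference does not say whether sizes are measured in characters or tokens, so this example assumes characters, and the function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 400) -> list[str]:
    """Split text into overlapping windows (illustrative, character-based)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2000))
    pieces = chunk_text(sample)
    print([len(p) for p in pieces])
    # Consecutive chunks share chunk_overlap characters.
    print(pieces[0][-400:] == pieces[1][:400])
```

With the defaults, each window advances by 1024 - 400 = 624 characters, so a larger overlap produces more chunks for the same input.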

`get_run_data`

Fetch run metrics (trace, latency, cost, tokens, status) for a session

pipeline.add(name="node").get_run_data(session_id="...")

Parameters

session_id

str

required

`input`

Pass data of different types into your workflow

pipeline.add(name="node").input()

Parameters

input_type

str

default:"'string'"

Raw Text

One of: agent, audio, bool, dataframe, file, float, image, int32, knowledge_base, pipeline, string, table, timestamp, vec<file>, vec<string>, vec<vec<file>>, vec<vec<string>>

use_default_value

bool

default:"False"

Whether to use a default value when no value is provided.

description

str

default:"''"

The input description. If the pipeline is used as a tool in an agent, the description is passed to the agent to help it fill in this input.

default_value

default:"{}"

The default value to use when no value is provided.

dataframe_type

str

default:"'table'"

The type of dataframe to use. One of: csv, dataframe_file, json, md, sql, table

file_parser

str

default:"'default'"

The model used to process the document. The default model provides standard document parsing and OCR. LlamaParse can read documents with complex features (e.g., tables and charts) and is charged at 0.3 cents per page. Textract provides the most advanced data extraction and is charged at 1.5 cents per page. Reducto enables rich structured parsing. One of: contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract
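Because input_type is a plain string, a typo such as "text" instead of "string" would only surface when the node is built or run. The following sketch checks a value against the allowed list from this page before it is passed to input(). It is a local convenience written for illustration, not part of the VectorShift SDK; check_input_type is a hypothetical name, and the suggestion logic uses Python's standard difflib module.

```python
import difflib

# Allowed values copied from the input_type parameter on this page.
ALLOWED_INPUT_TYPES = (
    "agent", "audio", "bool", "dataframe", "file", "float", "image", "int32",
    "knowledge_base", "pipeline", "string", "table", "timestamp",
    "vec<file>", "vec<string>", "vec<vec<file>>", "vec<vec<string>>",
)


def check_input_type(input_type: str = "string") -> str:
    """Return input_type unchanged if it is listed, otherwise raise ValueError."""
    if input_type in ALLOWED_INPUT_TYPES:
        return input_type
    # Offer up to three near matches to make typos easier to spot.
    hints = difflib.get_close_matches(input_type, ALLOWED_INPUT_TYPES, n=3)
    hint_text = f" Did you mean: {', '.join(hints)}?" if hints else ""
    raise ValueError(f"unsupported input_type {input_type!r}.{hint_text}")


if __name__ == "__main__":
    print(check_input_type("vec<file>"))
    try:
        check_input_type("text")
    except ValueError as err:
        print(err)
```

The returned string could then be supplied as the input_type argument, for example input(input_type=check_input_type("file")).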

`output`

Output data of different types from your workflow.

pipeline.add(name="node").output(value=...)

Parameters

output_type

str

default:"'string'"

One of: audio, bool, dataframe, file, float, image, int32, json, stream<string>, string, timestamp, vec<file>

description

str

default:"''"

The output description. If the pipeline is used as a tool in an agent, the description is passed to the agent to help it know how to fill this output.

format_output

bool

default:"True"

value

required

dataframe_type

str

default:"'table'"

The type of dataframe to use. One of: csv, dataframe_file, json, md, sql, table

`rename_file`

Rename an existing file, assigning a new name along with the file extension.

pipeline.add(name="node").rename_file(file=..., new_name="...")

Parameters

file

AcceptsFile

required

new_name

str

required

`text`

Accepts Text from upstream nodes and lets you write additional text or concatenate several texts to pass to downstream nodes.

pipeline.add(name="node").text(text="...")

Parameters

text

str

required

`text_to_file`

Convert data from type Text to type File.

pipeline.add(name="node").text_to_file(text="...", file_name="...")

Parameters

text

str

required

file_name

str

required

file_type

str

default:"'PDF'"

One of: DOCX, PDF, TXT
