pipeline.add(name="...").<node>(...). Each entry lists the node’s configuration parameters. See the Pipeline reference for add, run, and lifecycle methods.
append_files
Append files together in sequence.
One of:
PDF, PPTX
file
Load a static file into the workflow as a raw File or process it into Text.
Select an existing file from the VectorShift platform
One of:
name, upload
The name of the file from the VectorShift platform (for files on the File tab).
The processing model used to process the document. The default processing model performs standard document parsing / OCR. Llamaparse can read documents with complex features (e.g., tables, charts) and is charged at 0.3 cents per page. Textract provides the most advanced data extraction and is charged at 1.5 cents per page.
One of:
contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract
The file that was passed in.
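The per-page prices quoted above can be turned into a quick cost estimate before choosing a processing model. The following is a minimal sketch, not part of the VectorShift SDK: the helper name `estimate_cost_cents` and the price table are hypothetical, and only the two models with a stated price are included. Prices for the other models are not given here, so the helper refuses to guess.

```python
from decimal import Decimal

# Per-page prices in cents, copied from the description above.
# Models without a stated price are deliberately left out.
PRICE_CENTS_PER_PAGE = {
    "llama_parse": Decimal("0.3"),
    "textract": Decimal("1.5"),
}


def estimate_cost_cents(model: str, pages: int) -> Decimal:
    """Return the estimated processing cost in cents (hypothetical helper)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if model not in PRICE_CENTS_PER_PAGE:
        # No price is documented for this model, so do not guess one.
        raise KeyError(f"no documented per-page price for model {model!r}")
    return PRICE_CENTS_PER_PAGE[model] * pages


if __name__ == "__main__":
    # A 200-page document parsed with each priced model.
    for name in PRICE_CENTS_PER_PAGE:
        print(name, estimate_cost_cents(name, 200), "cents")
```

Decimal arithmetic is used so that fractional cent prices do not pick up floating-point rounding noise.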
file_operations
Process and manipulate files
file_save
Save a file on the VectorShift platform (under the ‘Files’ tab).
file_to_text
Convert data from type File to type Text
Whether to chunk the text into smaller pieces.
Use a password to decrypt the file.
The file to convert to text.
The type of file parser to use.
One of:
contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract
The password to decrypt the file.
The overlap of each chunk of text.
The size of each chunk of text.
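To illustrate how chunk size and chunk overlap interact, here is a minimal sketch of overlapping chunking in plain Python. It is an assumption-laden illustration, not the platform's implementation: the function name `chunk_text` and the parameter names `chunk_size` and `chunk_overlap` are hypothetical, and sizes are counted in characters, whereas the platform may count tokens or use a different splitting strategy.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks for the same text.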
get_run_data
Fetch run metrics (trace, latency, cost, tokens, status) for a session
input
Pass data of different types into your workflow
Platform docs: Pass data of different types into your workflow
Raw Text
Set default value to be used if no value is provided
The input description. If the pipeline is used as a tool in an agent, the description is passed to the agent to help it fill in this input.
default_value
AcceptsAgent | AcceptsAudio | AcceptsDataframe | AcceptsFile | AcceptsFileList | AcceptsImage | AcceptsKnowledgeBase | AcceptsPipeline | AcceptsTable | AcceptsTimestamp | ListType | bool | float | int | list[AcceptsFileList] | list[list[str]] | list[str] | str
default:"{}"
The default value to be used if no value is provided
The type of dataframe to be used
One of:
csv, dataframe_file, json, md, sql, table
The processing model used to process the document. The default processing model performs standard document parsing / OCR. Llamaparse can read documents with complex features (e.g., tables, charts) and is charged at 0.3 cents per page. Textract provides the most advanced data extraction and is charged at 1.5 cents per page. Reducto enables rich structured parsing.
One of:
contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract
output
Output data of different types from your workflow.
Platform docs: Output data of different types from your workflow.
The output description. If the pipeline is used as a tool in an agent, the description is passed to the agent to help it fill in this output.
value
AcceptsAudio | AcceptsDataframe | AcceptsFile | AcceptsFileList | AcceptsImage | AcceptsStream | AcceptsTimestamp | bool | float | int | str
required
The type of dataframe to be used
One of:
csv, dataframe_file, json, md, sql, table
rename_file
Rename an existing file, assigning a new name along with the file extension.
text
Accepts Text from upstream nodes and allows you to write additional text / concatenate different texts to pass to downstream nodes.
text_to_file
Convert data from type Text to type File.
One of:
DOCX, PDF, TXT