Data Loaders nodes

Add these nodes with the pipeline builder: pipeline.add(name="...").<node>(...). Each entry lists the node’s configuration parameters. See the Pipeline reference for add, run, and lifecycle methods.

`api` — API

Make an API request to a given URL.

pipeline.add(name="node").api(url="...")

Parameters

is_raw_json

bool

default:"False"

Whether to return the raw JSON response from the API

body_params

ListType | list[KeyValue] | list[List[Dict[str, Any]]]

default:"[]"

The body parameters to include in the API request

fail_non_2xx_response

bool

default:"False"

If enabled, the node fails when the API response status code is not in the 2xx range. Otherwise the response body is returned regardless of status code.

files

ListType | list[FileKeyValue] | list[List[Dict[str, Any]]]

default:"[]"

Files to include in the API request

headers

ListType | list[KeyValue] | list[List[Dict[str, Any]]]

default:"[]"

Headers to include in the API request

method

str

default:"'GET'"

The HTTP method to use for the request. One of: DELETE, GET, PATCH, POST, PUT

query_params

ListType | list[KeyValue] | list[List[Dict[str, Any]]]

default:"[]"

Query parameters to include in the API request

url

str

required

Target URL for the API Request

raw_json

str

default:"''"

The raw JSON request to the API
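As a rough sketch of how these parameters fit together, the helper below assembles keyword arguments for a POST request with a JSON body. The helper name `api_node_kwargs` and the example URL are illustrative and not part of the SDK, and sending the body through `raw_json` is an assumption about how the node consumes it. The final `pipeline.add(...)` call is left as a comment because it needs a real pipeline object.

```python
import json

# Allowed values for `method`, as listed above.
ALLOWED_METHODS = {"DELETE", "GET", "PATCH", "POST", "PUT"}


def api_node_kwargs(url, method="GET", payload=None, fail_on_error=True):
    """Build keyword arguments for the `api` node (hypothetical helper)."""
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    kwargs = {
        "url": url,
        "method": method,
        "fail_non_2xx_response": fail_on_error,
    }
    if payload is not None:
        # `raw_json` is typed as str, so serialize the body first.
        kwargs["raw_json"] = json.dumps(payload)
    return kwargs


kwargs = api_node_kwargs(
    "https://example.com/items", method="post", payload={"name": "demo"}
)
# With a real pipeline object (not constructed here):
# pipeline.add(name="create_item").api(**kwargs)
```

Setting `fail_non_2xx_response` to `True` makes error responses stop the node instead of passing the error body downstream.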

`arxiv` — Arxiv

Query arXiv to return relevant articles

pipeline.add(name="node").arxiv(query="...")

Parameters

chunk_text

bool

default:"False"

Whether to chunk the text

query

str

required

The arXiv query

chunk_overlap

int

default:"0"

The overlap of the chunks

chunk_size

int

default:"512"

The size of the chunks to create
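The reference does not say how the node splits text, so the following is only a generic sketch of fixed-size chunking with overlap, to show how `chunk_size` and `chunk_overlap` typically interact. It assumes sizes are counted in characters (the node may count tokens instead), and `split_with_overlap` is an illustrative name, not an SDK function.

```python
def split_with_overlap(text, chunk_size=512, chunk_overlap=0):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
# chunks == ["abcd", "defg", "ghij", "j"]
```

In this sketch a larger overlap repeats more context between neighbouring chunks at the cost of producing more chunks.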

`crunchbase` — Crunchbase

Call the Crunchbase API to look up companies, people, funding rounds, and acquisitions.

pipeline.add(name="node").crunchbase(endpoint="...")

Parameters

body

str

default:"''"

endpoint

str

required

query_params

str

default:"''"

`csv_query` — CSV Query

Uses an LLM agent to query one or more CSV files. The CSV delimiter must be a comma.

pipeline.add(name="node").csv_query(query="...", csv=...)

Parameters

query

str

required

csv

AcceptsFile

required

stream

bool

default:"False"
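Because the node expects comma-delimited input, files exported with another delimiter need converting first. The sketch below does this with Python's standard `csv` module; `to_comma_delimited` is an illustrative helper, not part of the SDK.

```python
import csv
import io


def to_comma_delimited(text, delimiter=";"):
    """Rewrite delimited text as comma-separated CSV."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    # The writer quotes any field that itself contains a comma.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


converted = to_comma_delimited('name;price\n"Widget, large";9.5\n')
# converted == 'name,price\n"Widget, large",9.5\n'
```

Going through `csv.reader` and `csv.writer` rather than a plain string replace keeps fields that already contain commas intact.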

`deep_research` — Deep Research

Perform advanced AI research and analysis with specialized model capabilities

Platform docs: Deep Research

pipeline.add(name="node").deep_research(provider="anthropic", api_key="...")

Parameters

provider

str

required

Select the LLM provider for deep research

Allowed values:

anthropic, azure, bedrock, cohere, custom, fireworks, google, groq, openai, perplexity, together, xai

stream

bool

default:"False"

Whether to stream the research response

use_personal_api_key

bool

default:"False"

Whether to use a personal API key

model

str

default:"''"

Select the LLM model for deep research

Allowed values:

MiniMaxAI/MiniMax-M2.5, MiniMaxAI/MiniMax-M2.7, Qwen/QwQ-32B-Preview, Qwen/Qwen2.5-72B-Instruct-Turbo-lora, Qwen/Qwen2.5-7B-Instruct-Turbo, Qwen/Qwen3-235B-A22B-Instruct-2507-tput, Qwen/Qwen3-235B-A22B-fp8-tput, Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8, Qwen/Qwen3-VL-8B-Instruct, Qwen/Qwen3.5-397B-A17B, Qwen/Qwen3.5-9B, Qwen/Qwen3.6-Plus, accounts/fireworks/models/deepseek-v4-pro, accounts/fireworks/models/glm-5p1, accounts/fireworks/models/gpt-oss-120b, accounts/fireworks/models/kimi-k2p5, accounts/fireworks/models/kimi-k2p6, accounts/fireworks/models/minimax-m2p7, accounts/fireworks/models/qwen3-235b-a22b, accounts/fireworks/models/qwen3p5-397b-a17b, accounts/fireworks/models/qwen3p6-plus, amazon.nova-lite-v1:0, amazon.nova-micro-v1:0, amazon.nova-pro-v1:0, amazon.titan-text-express-v1, amazon.titan-text-lite-v1, chatgpt-4o-latest, claude-3-5-haiku-20241022, claude-3-7-sonnet-20250219, claude-3-haiku-20240307, claude-haiku-4-5-20251001, claude-opus-4-1-20250805, claude-opus-4-20250514, claude-opus-4-5-20251101, claude-opus-4-6, claude-opus-4-7, claude-opus-4-8, claude-sonnet-4-20250514, claude-sonnet-4-5, claude-sonnet-4-6, command-nightly, command-r-08-2024, command-r-plus-08-2024, deepcogito/cogito-v2-1-671b, deepseek-ai/DeepSeek-R1-Distill-Llama-70B, deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V4-Pro, deepseek-ai/deepseek-llm-67b-chat, gemini-2.0-flash-001, gemini-2.0-flash-lite-preview-02-05, gemini-2.5-flash, gemini-2.5-pro, gemini-3-flash-preview, gemini-3-pro-preview, gemini-3.1-flash-lite-preview, gemini-3.1-pro-preview, gemini-3.5-flash, gemma2-9b-it, google/gemma-2-27b-it, google/gemma-2-9b-it, google/gemma-2b-it, google/gemma-3n-E4B-it, google/gemma-4-31B-it, gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4-turbo-2024-04-09, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-2024-08-06, gpt-4o-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.2, gpt-5.3-codex, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5.5, grok-2, 
grok-2-vision, grok-3-beta, grok-3-fast-beta, grok-3-mini-beta, grok-3-mini-fast-beta, grok-4, grok-4-0629, grok-4-0709, grok-4-fast-non-reasoning, grok-4-fast-reasoning, grok-4-latest, llama-3.1-8b-instant, llama-3.3-70b-versatile, meta-llama/Llama-3-70b-chat-hf, meta-llama/Llama-3-8b-chat-hf, meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo, meta-llama/Llama-3.2-3B-Instruct-Turbo, meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo, meta-llama/Llama-3.3-70B-Instruct-Turbo, meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8, meta-llama/Llama-4-Scout-17B-16E-Instruct, meta-llama/Meta-Llama-3-70B-Instruct-Lite, meta-llama/Meta-Llama-3-70B-Instruct-Turbo, meta-llama/Meta-Llama-3-8B-Instruct-Lite, meta-llama/Meta-Llama-3-8B-Instruct-Turbo, meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo, meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo, meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo, meta.llama3-8b-instruct-v1:0, mistralai/Mistral-7B-Instruct-v0.1, mistralai/Mistral-7B-Instruct-v0.2, mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mixtral-8x22B-Instruct-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mixtral-8x7b-32768, moonshotai/Kimi-K2-Instruct, moonshotai/Kimi-K2.5, moonshotai/Kimi-K2.6, o1, o3, o3-mini, o4-mini, openai/gpt-oss-120b, openai/gpt-oss-20b, perplexity-ai/r1-1776, r1-1776, sonar, sonar-deep-research, sonar-pro, sonar-reasoning-pro, us.anthropic.claude-haiku-4-5-20251001-v1:0, us.anthropic.claude-opus-4-1-20250805-v1:0, us.anthropic.claude-opus-4-5-20251101-v1:0, us.anthropic.claude-opus-4-6-v1, us.anthropic.claude-sonnet-4-20250514-v1:0, us.anthropic.claude-sonnet-4-5-20250929-v1:0, us.anthropic.claude-sonnet-4-6, us.meta.llama3-1-70b-instruct-v1:0, us.meta.llama3-1-8b-instruct-v1:0, us.meta.llama3-2-11b-instruct-v1:0, us.meta.llama3-2-1b-instruct-v1:0, us.meta.llama3-2-3b-instruct-v1:0, us.meta.llama3-2-90b-instruct-v1:0, zai-org/GLM-4.5-Air-FP8, zai-org/GLM-5, zai-org/GLM-5.1

api_key

str

required

Your personal API key for the research provider

conversation

str

default:"''"

Previous conversation context for research continuity

input

str

default:"''"

The data input for deep research analysis

instructions

str

default:"''"

Specific instructions for the deep research task

max_output_tokens

int

default:"128000"

Maximum number of tokens in the research output

max_tool_calls

int

default:"10"

Maximum number of tool calls during research

parallel_tool_calls

bool

default:"True"

Enable parallel tool execution for faster research

previous_response_id

str

default:"''"

ID of previous response for continuation

prompt_cache_key

str

default:"''"

Cache key for prompt optimization

safety_identifier

str

default:"''"

Safety identifier for content filtering

service_tier

str

default:"''"

Service tier for the research request

tool_choice

str

default:"''"

Specific tool choice for research

tools

ListType | list[List[Dict[str, Any]]] | list[dict]

default:"[]"

Tools to use for research

truncation

str

default:"''"

Truncation strategy for long inputs

deployment_id

str

default:"''"

The deployment ID for the Azure OpenAI model

endpoint

str

default:"''"

The Azure OpenAI endpoint URL

finetuned_model

str

default:"''"

Use your finetuned model for deep research
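A minimal sketch of assembling arguments for this node follows. It checks `provider` against the allowed values above and attaches `deployment_id` and `endpoint` only for `azure`. That these two fields matter only for Azure is an assumption drawn from their descriptions, and `deep_research_kwargs` is a hypothetical helper name.

```python
ALLOWED_PROVIDERS = {
    "anthropic", "azure", "bedrock", "cohere", "custom", "fireworks",
    "google", "groq", "openai", "perplexity", "together", "xai",
}


def deep_research_kwargs(provider, api_key, model="", instructions="",
                         deployment_id="", endpoint=""):
    """Build keyword arguments for `deep_research` (hypothetical helper)."""
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    kwargs = {"provider": provider, "api_key": api_key}
    if model:
        kwargs["model"] = model
    if instructions:
        kwargs["instructions"] = instructions
    if provider == "azure":
        # Assumption: these fields are only relevant for Azure OpenAI.
        kwargs["deployment_id"] = deployment_id
        kwargs["endpoint"] = endpoint
    return kwargs


kwargs = deep_research_kwargs(
    "openai",
    api_key="YOUR_API_KEY",
    model="gpt-4.1",
    instructions="Summarize recent work on retrieval-augmented generation.",
)
# With a real pipeline object (not constructed here):
# pipeline.add(name="research").deep_research(**kwargs)
```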

`exa_ai` — Query the Exa search API

Query the Exa search API

pipeline.add(name="node").exa_ai(query="...")

Parameters

query

str

required

end_crawl_date

str

default:"''"

end_published_date

str

default:"''"

livecrawl

bool

default:"True"

loader_type

str

default:"'EXA_AI_SEARCH'"

One of: EXA_AI_SEARCH, EXA_AI_SEARCH_COMPANIES, EXA_AI_SEARCH_FINANCIAL_REPORTS, EXA_AI_SEARCH_NEWS, EXA_AI_SEARCH_PEOPLE, EXA_AI_SEARCH_PERSONAL_SITES, EXA_AI_SEARCH_RESEARCH_PAPERS, EXA_AI_SEARCH_TWEETS

max_characters

int

default:"2000"

num_results

int

default:"10"

search_type

str

default:"'auto'"

One of: auto, deep, deep-max, deep-reasoning, fast, instant, neural

start_crawl_date

str

default:"''"

start_published_date

str

default:"''"

use_highlights

bool

default:"False"

`fetch_filings` — Fetch Filings

Search for financial filings (10-K, 10-Q) by stock ticker

pipeline.add(name="node").fetch_filings(tickers="...")

Parameters

document_group_ids

str

default:"'3,4'"

end_date

str

default:"''"

limit

int

default:"5"

loader_type

str

default:"'filings'"

One of: filings

start_date

str

default:"''"

tickers

str

required

`fetch_financials` — Fetch Financials

Fetch income statements, balance sheets, or cash flow statements with SEC filing citations

pipeline.add(name="node").fetch_financials(statement_type="balance-sheet", tickers="...")

Parameters

currency

str

default:"''"

limit

int

default:"5"

loader_type

str

default:"'financials'"

One of: financials

metrics

str

default:"''"

period_type

str

default:"'annual'"

One of: annual, latest, ltm, quarterly

statement_type

str

required

One of: balance-sheet, cash-flow-statement, income-statement

tickers

str

required
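Since `statement_type` and `period_type` accept only the values listed above, it can help to validate them before building the node. The sketch below does that in a hypothetical helper, `fetch_financials_kwargs`. Joining several tickers into one comma-separated string is an assumption; the reference only says `tickers` is a `str`.

```python
STATEMENT_TYPES = {"balance-sheet", "cash-flow-statement", "income-statement"}
PERIOD_TYPES = {"annual", "latest", "ltm", "quarterly"}


def fetch_financials_kwargs(tickers, statement_type, period_type="annual", limit=5):
    """Validate enum-like values and build kwargs (hypothetical helper)."""
    if statement_type not in STATEMENT_TYPES:
        raise ValueError(f"statement_type must be one of {sorted(STATEMENT_TYPES)}")
    if period_type not in PERIOD_TYPES:
        raise ValueError(f"period_type must be one of {sorted(PERIOD_TYPES)}")
    return {
        # Assumption: several tickers are passed as one comma-separated string.
        "tickers": ",".join(t.strip() for t in tickers),
        "statement_type": statement_type,
        "period_type": period_type,
        "limit": limit,
    }


kwargs = fetch_financials_kwargs(
    ["AAPL", " MSFT"], "income-statement", period_type="quarterly"
)
# With a real pipeline object (not constructed here):
# pipeline.add(name="financials").fetch_financials(**kwargs)
```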

`fetch_logos` — Fetch Logos

Fetch company logos by ticker or domain

pipeline.add(name="node").fetch_logos(tickers="...")

Parameters

domains

str

default:"''"

format

str

default:"'png'"

One of: jpg, png, webp

retina

bool

default:"True"

size

int

default:"256"

theme

str

default:"'light'"

One of: dark, light

tickers

str

required

`fetch_ratios` — Fetch Ratios

Fetch financial ratios (P/E, EV/EBITDA, margins, growth) for one or more companies

pipeline.add(name="node").fetch_ratios(tickers="...")

Parameters

currency

str

default:"''"

daily

str

default:"'false'"

loader_type

str

default:"'ratios'"

One of: ratios

period_type

str

default:"'annual'"

One of: annual, latest, quarterly

ratio_ids

str

default:"''"

tickers

str

required

`fetch_slides` — Fetch Slides

Search for investor presentation slides by stock ticker

pipeline.add(name="node").fetch_slides(tickers="...")

Parameters

document_group_ids

str

default:"''"

end_date

str

default:"''"

limit

int

default:"1"

loader_type

str

default:"'slides'"

One of: slides

start_date

str

default:"''"

tickers

str

required

`fetch_stock_prices` — Fetch Stock Prices

Fetch historical stock prices for one or more companies

pipeline.add(name="node").fetch_stock_prices(tickers="...")

Parameters

end_date

str

default:"''"

loader_type

str

default:"'stock_prices'"

One of: stock_prices

start_date

str

default:"''"

tickers

str

required

`fetch_transcripts` — Fetch Transcripts

Search for earnings call transcripts by stock ticker

pipeline.add(name="node").fetch_transcripts(tickers="...")

Parameters

end_date

str

default:"''"

limit

int

default:"5"

loader_type

str

default:"'transcripts'"

One of: transcripts

start_date

str

default:"''"

tickers

str

required

`file` — File

Load a static file into the workflow as a raw File or process it into Text.

Platform docs: File

pipeline.add(name="node").file(file_name="...", file=...)

Parameters

selected_option

str

default:"'upload'"

Select an existing file from the VectorShift platform. One of: name, upload

file_name

str

required

The name of the file from the VectorShift platform (for files on the File tab)

file_parser

str

default:"'default'"

The processing model used to parse the document. The default model performs standard document parsing and OCR. LlamaParse can read documents with complex features such as tables and charts, and is charged at 0.3 cents per page. Textract provides the most advanced data extraction and is charged at 1.5 cents per page. One of: contextual_ai, default, docling, llama_parse, mistral_ocr, reducto, textract

file

AcceptsFile

required

The file that was passed in

`google_alert_rss_reader` — Google Alert RSS Reader

Read the contents from an RSS feed created from a Google Alert: https://www.google.com/alerts

pipeline.add(name="node").google_alert_rss_reader(feed_link="...")

Parameters

feed_link

str

required

timeframe

str

default:"'all'"

One of: all, past day, past month, past week

`google_search` — Google Search

Query the Google Search API

pipeline.add(name="node").google_search(query="...")

Parameters

query

str

required

location

str

default:"'us'"

Allowed values:

ad, ae, af, ag, ai, al, am, an, ao, aq, ar, as, at, au, aw, az, ba, bb, bd, be, bf, bg, bh, bi, bj, bm, bn, bo, br, bs, bt, bv, bw, by, bz, ca, cc, cd, cf, cg, ch, ci, ck, cl, cm, cn, co, cr, cs, cu, cv, cx, cy, cz, de, dj, dk, dm, do, dz, ec, ee, eg, eh, er, es, et, fi, fj, fk, fm, fo, fr, ga, gd, ge, gf, gh, gi, gl, gm, gn, gp, gq, gr, gs, gt, gu, gw, gy, hk, hm, hn, hr, ht, hu, id, ie, il, in, io, iq, ir, is, it, jm, jo, jp, ke, kg, kh, ki, km, kn, kp, kr, kw, ky, kz, la, lb, lc, li, lk, lr, ls, lt, lu, lv, ly, ma, mc, md, mg, mh, mk, ml, mm, mn, mo, mp, mq, mr, ms, mt, mu, mv, mw, mx, my, mz, na, nc, ne, nf, ng, ni, nl, no, np, nr, nu, nz, om, pa, pe, pf, pg, ph, pk, pl, pm, pn, pr, ps, pt, pw, py, qa, re, ro, ru, rw, sa, sb, sc, sd, se, sg, sh, si, sj, sk, sl, sm, sn, so, sr, st, sv, sy, sz, tc, td, tf, tg, th, tj, tk, tl, tm, tn, to, tr, tt, tv, tw, tz, ua, ug, uk, um, us, uy, uz, va, vc, ve, vg, vi, vn, vu, wf, ws, ye, yt, za, zm, zw

num_results

int

default:"10"

search_type

str

default:"'web'"

One of: events, hotels, image, news, web

`parallel_ai_search` — Parallel AI Search

Search the web with Parallel’s Search API

pipeline.add(name="node").parallel_ai_search()

Parameters

blocked_domains

list[str]

default:"[]"

excerpts_max_chars_per_result

int

default:"1500"

fetch_live_results

bool

default:"False"

max_chars_per_result

int

default:"1500"

max_results

int

default:"10"

mode

str

default:"'one-shot'"

One of: agentic, one-shot

objective

str

default:"''"

preferred_domains

list[str]

default:"[]"

processor

str

default:"''"

One of: '' (empty string), base, pro

search_queries

list[str]

default:"[]"

`perplexity_search` — Query the Perplexity search API

Query the Perplexity search API

pipeline.add(name="node").perplexity_search(query="...")

Parameters

query

str

required

last_updated_after_filter

str

default:"''"

last_updated_before_filter

str

default:"''"

max_results

int

default:"10"

max_tokens_per_page

int

default:"1000"

query_list

list[str]

default:"[]"

return_images

bool

default:"False"

return_snippets

bool

default:"True"

search_after_date_filter

str

default:"''"

search_before_date_filter

str

default:"''"

search_domain_filter

list[str]

default:"[]"

search_mode

str

default:"''"

One of: '' (empty string), academic, web

search_recency_filter

str

default:"''"

One of: '' (empty string), day, month, week, year

user_location_latitude

str

default:"''"

user_location_longitude

str

default:"''"

user_location_name

str

default:"''"

user_location_radius_km

str

default:"''"

`reducto_extract` — Reducto Extract

Extract structured data from documents using Reducto

pipeline.add(name="node").reducto_extract(json_schema="...", system_prompt="...")

Parameters

use_personal_api_key

bool

default:"False"

Use your own Reducto API key instead of the platform default.

return_citations

bool

default:"False"

Return citation bounding boxes for extracted fields.

api_key

str

default:"''"

Your personal Reducto API key.

deep_extract

bool

default:"True"

Enable agentic deep extraction for higher accuracy. Uses iterative verification against the source material.

files

AcceptsFileList

default:"[]"

Documents to extract data from (up to 2,500 pages per document).

json_schema

str

required

A JSON schema defining the structure of data to extract. Use descriptive field names.

system_prompt

str

required

Instructions for how the AI should extract and verify data from the documents.
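`json_schema` is typed as `str`, so a schema written as a Python dictionary has to be serialized before it is passed in. The sketch below shows one way to do that; the invoice fields and the prompt are made up for illustration, and which JSON Schema features Reducto honours is not stated here.

```python
import json

# Illustrative schema with descriptive field names, as the parameter text advises.
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {
            "type": "string",
            "description": "Invoice identifier as printed on the document",
        },
        "total_amount": {
            "type": "number",
            "description": "Grand total including tax",
        },
    },
    "required": ["invoice_number", "total_amount"],
}

# Serialize the dictionary into the string form the parameter expects.
json_schema = json.dumps(schema, indent=2)
system_prompt = "Extract the invoice number and total. Leave a field empty if it is absent."

# With a real pipeline object (not constructed here):
# pipeline.add(name="extract_invoice").reducto_extract(
#     json_schema=json_schema, system_prompt=system_prompt
# )
```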

`rss` — RSS

RSS

pipeline.add(name="node").rss()

Parameters

sub_type

str

default:"''"

`rss_feed_reader` — RSS Feed Reader

Read the contents from an RSS feed

pipeline.add(name="node").rss_feed_reader(url="...")

Parameters

entries

int

default:"10"

timeframe

str

default:"'all'"

One of: all, past day, past month, past week

url

str

required

`serp_api` — Serp API

Query the SerpApi Google Search API

pipeline.add(name="node").serp_api(query="...", api_key="...")

Parameters

query

str

required

api_key

str

required

`url_loader` — URL Loader

Scrape the contents from a URL

pipeline.add(name="node").url_loader(api_key="...", url="...")

Parameters

provider

str

default:"'jina'"

The provider to use for the URL scraper. One of: apify, jina, modal

use_actions

bool

default:"True"

Perform browser actions to interact with the input website

recursive

bool

default:"False"

Whether to recursively load the URL

api_key

str

required

The API key to use

url

str

required

The URL to load

url_limit

int

default:"10"

The maximum number of URLs to load

actions

AcceptsAnyList | ListType | list[AcceptsAnyList]

default:"[]"

The browser actions to perform on the URL

ai_enhance_content

bool

default:"False"

Whether to enhance the content

load_sitemap

bool

default:"False"

Load URLs to crawl from a sitemap. If the URL is a sitemap, it will be used directly. If the URL is not a sitemap, the sitemap will be fetched automatically.

use_proxy

bool

default:"False"

Use a proxy to crawl the website
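As a sketch of a recursive crawl configuration, the helper below checks the URL and provider before building arguments. `url_loader_kwargs` is a hypothetical helper and the URL is a placeholder. The checks are local sanity checks only and say nothing about what the node itself validates.

```python
from urllib.parse import urlparse

PROVIDERS = {"apify", "jina", "modal"}


def url_loader_kwargs(url, api_key, provider="jina", recursive=False, url_limit=10):
    """Build keyword arguments for `url_loader` (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url}")
    if provider not in PROVIDERS:
        raise ValueError(f"provider must be one of {sorted(PROVIDERS)}")
    if url_limit < 1:
        raise ValueError("url_limit must be at least 1")
    return {
        "url": url,
        "api_key": api_key,
        "provider": provider,
        "recursive": recursive,
        "url_limit": url_limit,
    }


kwargs = url_loader_kwargs(
    "https://example.com/docs", api_key="YOUR_API_KEY", recursive=True, url_limit=25
)
# With a real pipeline object (not constructed here):
# pipeline.add(name="crawl_docs").url_loader(**kwargs)
```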

`web` — Web Search

Web Search

pipeline.add(name="node").web()

Parameters

sub_type

str

default:"''"

`wikipedia` — Wikipedia

Query Wikipedia to return relevant articles

pipeline.add(name="node").wikipedia(query="...")

Parameters

chunk_text

bool

default:"False"

Whether to chunk the text

query

str

required

The Wikipedia query

chunk_overlap

int

default:"0"

The overlap of the chunks

chunk_size

int

default:"512"

The size of the chunks to create

`you_dot_com` — Query the You.com search API

Query the You.com search API

pipeline.add(name="node").you_dot_com(query="...", api_key="...")

Parameters

loader_type

str

default:"'YOU_DOT_COM'"

Select the loader type: General or News. One of: YOU_DOT_COM, YOU_DOT_COM_NEWS

query

str

required

The search query

api_key

str

required

You.com API key

`youtube` — Youtube

Get the transcript of a YouTube video.

pipeline.add(name="node").youtube(url="...")

Parameters

chunk_text

bool

default:"False"

Whether to chunk the text

url

str

required

The YouTube URL to get the transcript of

chunk_overlap

int

default:"0"

The overlap of the chunks

chunk_size

int

default:"512"

The size of the chunks to create
