# Data Loaders nodes

> Load from files, the web, search providers, RSS, and external data APIs.

Add these nodes with the pipeline builder: `pipeline.add(name="...").<node_type>(...)`. Each entry lists the node's configuration parameters. See the [Pipeline reference](/sdk/pipeline/reference) for `add`, `run`, and lifecycle methods.

## `api` — API

Make an API request to a given URL.

```python
pipeline.add(name="node").api(url="...")
```

**Parameters**

- Whether to return the raw JSON response from the API
- The body parameters to include in the API request
- If enabled, the node fails when the API response status code is not in the 2xx range. Otherwise the response body is returned regardless of status code.
- Files to include in the API request
- Headers to include in the API request
- The HTTP method to use. One of: `DELETE`, `GET`, `PATCH`, `POST`, `PUT`
- Query parameters to include in the API request
- Target URL for the API request
- The raw JSON request to the API

## `arxiv` — Arxiv

Query arXiv to return relevant articles.

```python
pipeline.add(name="node").arxiv(query="...")
```

**Parameters**

- Whether to chunk the text
- The arXiv query
- The overlap of the chunks
- The size of the chunks to create

## `crunchbase` — Crunchbase

Call the Crunchbase API to look up companies, people, funding rounds, and acquisitions.

```python
pipeline.add(name="node").crunchbase(endpoint="...")
```

**Parameters**

## `csv_query` — CSV Query

Uses an LLM agent to query one or more CSV files. The CSV delimiter must be a comma.

```python
pipeline.add(name="node").csv_query(query="...", csv=...)
```

**Parameters**

## `deep_research` — Deep Research

Perform advanced AI research and analysis with specialized model capabilities.

Platform docs: [Deep Research](/nodes/deep-research/overview)

```python
pipeline.add(name="node").deep_research(provider="anthropic", api_key="...")
```

**Parameters**

- Select the LLM provider for deep research: `anthropic`, `azure`, `bedrock`, `cohere`, `custom`, `fireworks`, `google`, `groq`, `openai`, `perplexity`, `together`, `xai`
- Whether to stream the research response
- Whether to use a personal API key
- Select the LLM model for deep research: `MiniMaxAI/MiniMax-M2.5`, `MiniMaxAI/MiniMax-M2.7`, `Qwen/QwQ-32B-Preview`, `Qwen/Qwen2.5-72B-Instruct-Turbo-lora`, `Qwen/Qwen2.5-7B-Instruct-Turbo`, `Qwen/Qwen3-235B-A22B-Instruct-2507-tput`, `Qwen/Qwen3-235B-A22B-fp8-tput`, `Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8`, `Qwen/Qwen3-VL-8B-Instruct`, `Qwen/Qwen3.5-397B-A17B`, `Qwen/Qwen3.5-9B`, `Qwen/Qwen3.6-Plus`, `accounts/fireworks/models/deepseek-v4-pro`, `accounts/fireworks/models/glm-5p1`, `accounts/fireworks/models/gpt-oss-120b`, `accounts/fireworks/models/kimi-k2p5`, `accounts/fireworks/models/kimi-k2p6`, `accounts/fireworks/models/minimax-m2p7`, `accounts/fireworks/models/qwen3-235b-a22b`, `accounts/fireworks/models/qwen3p5-397b-a17b`, `accounts/fireworks/models/qwen3p6-plus`, `amazon.nova-lite-v1:0`, `amazon.nova-micro-v1:0`, `amazon.nova-pro-v1:0`, `amazon.titan-text-express-v1`, `amazon.titan-text-lite-v1`, `chatgpt-4o-latest`, `claude-3-5-haiku-20241022`, `claude-3-7-sonnet-20250219`, `claude-3-haiku-20240307`, `claude-haiku-4-5-20251001`, `claude-opus-4-1-20250805`, `claude-opus-4-20250514`, `claude-opus-4-5-20251101`, `claude-opus-4-6`, `claude-opus-4-7`, `claude-opus-4-8`, `claude-sonnet-4-20250514`, `claude-sonnet-4-5`, `claude-sonnet-4-6`, `command-nightly`, `command-r-08-2024`, `command-r-plus-08-2024`, `deepcogito/cogito-v2-1-671b`, `deepseek-ai/DeepSeek-R1-Distill-Llama-70B`,
`deepseek-ai/DeepSeek-V3`, `deepseek-ai/DeepSeek-V4-Pro`, `deepseek-ai/deepseek-llm-67b-chat`, `gemini-2.0-flash-001`, `gemini-2.0-flash-lite-preview-02-05`, `gemini-2.5-flash`, `gemini-2.5-pro`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3.5-flash`, `gemma2-9b-it`, `google/gemma-2-27b-it`, `google/gemma-2-9b-it`, `google/gemma-2b-it`, `google/gemma-3n-E4B-it`, `google/gemma-4-31B-it`, `gpt-3.5-turbo`, `gpt-4`, `gpt-4-turbo`, `gpt-4-turbo-2024-04-09`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-4o`, `gpt-4o-2024-08-06`, `gpt-4o-mini`, `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5.1`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`, `gpt-5.2`, `gpt-5.3-codex`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, `gpt-5.5`, `grok-2`, `grok-2-vision`, `grok-3-beta`, `grok-3-fast-beta`, `grok-3-mini-beta`, `grok-3-mini-fast-beta`, `grok-4`, `grok-4-0629`, `grok-4-0709`, `grok-4-fast-non-reasoning`, `grok-4-fast-reasoning`, `grok-4-latest`, `llama-3.1-8b-instant`, `llama-3.3-70b-versatile`, `meta-llama/Llama-3-70b-chat-hf`, `meta-llama/Llama-3-8b-chat-hf`, `meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo`, `meta-llama/Llama-3.2-3B-Instruct-Turbo`, `meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo`, `meta-llama/Llama-3.3-70B-Instruct-Turbo`, `meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8`, `meta-llama/Llama-4-Scout-17B-16E-Instruct`, `meta-llama/Meta-Llama-3-70B-Instruct-Lite`, `meta-llama/Meta-Llama-3-70B-Instruct-Turbo`, `meta-llama/Meta-Llama-3-8B-Instruct-Lite`, `meta-llama/Meta-Llama-3-8B-Instruct-Turbo`, `meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo`, `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo`, `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo`, `meta.llama3-8b-instruct-v1:0`, `mistralai/Mistral-7B-Instruct-v0.1`, `mistralai/Mistral-7B-Instruct-v0.2`, `mistralai/Mistral-7B-Instruct-v0.3`, `mistralai/Mixtral-8x22B-Instruct-v0.1`, `mistralai/Mixtral-8x7B-Instruct-v0.1`, `mixtral-8x7b-32768`, 
`moonshotai/Kimi-K2-Instruct`, `moonshotai/Kimi-K2.5`, `moonshotai/Kimi-K2.6`, `o1`, `o3`, `o3-mini`, `o4-mini`, `openai/gpt-oss-120b`, `openai/gpt-oss-20b`, `perplexity-ai/r1-1776`, `r1-1776`, `sonar`, `sonar-deep-research`, `sonar-pro`, `sonar-reasoning-pro`, `us.anthropic.claude-haiku-4-5-20251001-v1:0`, `us.anthropic.claude-opus-4-1-20250805-v1:0`, `us.anthropic.claude-opus-4-5-20251101-v1:0`, `us.anthropic.claude-opus-4-6-v1`, `us.anthropic.claude-sonnet-4-20250514-v1:0`, `us.anthropic.claude-sonnet-4-5-20250929-v1:0`, `us.anthropic.claude-sonnet-4-6`, `us.meta.llama3-1-70b-instruct-v1:0`, `us.meta.llama3-1-8b-instruct-v1:0`, `us.meta.llama3-2-11b-instruct-v1:0`, `us.meta.llama3-2-1b-instruct-v1:0`, `us.meta.llama3-2-3b-instruct-v1:0`, `us.meta.llama3-2-90b-instruct-v1:0`, `zai-org/GLM-4.5-Air-FP8`, `zai-org/GLM-5`, `zai-org/GLM-5.1`
- Your personal API key for the research provider
- Previous conversation context for research continuity
- The data input for deep research analysis
- Specific instructions for the deep research task
- Maximum number of tokens in the research output
- Maximum number of tool calls during research
- Enable parallel tool execution for faster research
- ID of previous response for continuation
- Cache key for prompt optimization
- Safety identifier for content filtering
- Service tier for the research request
- Specific tool choice for research
- Tools to use for research
- Truncation strategy for long inputs
- The deployment ID for the Azure OpenAI model
- The Azure OpenAI endpoint URL
- Use your finetuned model for deep research

## `exa_ai` — Query the Exa search API

Query the Exa search API.

```python
pipeline.add(name="node").exa_ai(query="...")
```

**Parameters**

- One of: `EXA_AI_SEARCH`, `EXA_AI_SEARCH_COMPANIES`, `EXA_AI_SEARCH_FINANCIAL_REPORTS`, `EXA_AI_SEARCH_NEWS`, `EXA_AI_SEARCH_PEOPLE`, `EXA_AI_SEARCH_PERSONAL_SITES`, `EXA_AI_SEARCH_RESEARCH_PAPERS`, `EXA_AI_SEARCH_TWEETS`
- One of: `auto`, `deep`, `deep-max`, `deep-reasoning`,
`fast`, `instant`, `neural`

## `fetch_filings` — Fetch Filings

Search for financial filings (10-K, 10-Q) by stock ticker.

```python
pipeline.add(name="node").fetch_filings(tickers="...")
```

**Parameters**

- One of: `filings`

## `fetch_financials` — Fetch Financials

Fetch income statements, balance sheets, or cash flow statements with SEC filing citations.

```python
pipeline.add(name="node").fetch_financials(statement_type="balance-sheet", tickers="...")
```

**Parameters**

- One of: `financials`
- One of: `annual`, `latest`, `ltm`, `quarterly`
- One of: `balance-sheet`, `cash-flow-statement`, `income-statement`

## `fetch_logos` — Fetch Logos

Fetch company logos by ticker or domain.

```python
pipeline.add(name="node").fetch_logos(tickers="...")
```

**Parameters**

- One of: `jpg`, `png`, `webp`
- One of: `dark`, `light`

## `fetch_ratios` — Fetch Ratios

Fetch financial ratios (P/E, EV/EBITDA, margins, growth) for one or more companies.

```python
pipeline.add(name="node").fetch_ratios(tickers="...")
```

**Parameters**

- One of: `ratios`
- One of: `annual`, `latest`, `quarterly`

## `fetch_slides` — Fetch Slides

Search for investor presentation slides by stock ticker.

```python
pipeline.add(name="node").fetch_slides(tickers="...")
```

**Parameters**

- One of: `slides`

## `fetch_stock_prices` — Fetch Stock Prices

Fetch historical stock prices for one or more companies.

```python
pipeline.add(name="node").fetch_stock_prices(tickers="...")
```

**Parameters**

- One of: `stock_prices`

## `fetch_transcripts` — Fetch Transcripts

Search for earnings call transcripts by stock ticker.

```python
pipeline.add(name="node").fetch_transcripts(tickers="...")
```

**Parameters**

- One of: `transcripts`

## `file` — File

Load a static file into the workflow as a raw File or process it into Text.

Platform docs: [File](/nodes/file/overview)

```python
pipeline.add(name="node").file(file_name="...", file=...)
```

**Parameters**

- Select an existing file from the VectorShift platform. One of: `name`, `upload`
- The name of the file from the VectorShift platform (for files on the File tab)
- The processing model used to parse the document. The default model performs standard document parsing and OCR. LlamaParse can read documents with complex features (e.g., tables and charts) and is charged at 0.3 cents per page. Textract provides the most advanced data extraction and is charged at 1.5 cents per page. One of: `contextual_ai`, `default`, `docling`, `llama_parse`, `mistral_ocr`, `reducto`, `textract`
- The file that was passed in

## `google_alert_rss_reader` — Google Alert RSS Reader

Read the contents from an RSS feed created from a Google Alert: [https://www.google.com/alerts](https://www.google.com/alerts)

```python
pipeline.add(name="node").google_alert_rss_reader(feed_link="...")
```

**Parameters**

- One of: `all`, `past day`, `past month`, `past week`

## `google_search` — Google Search

Query the Google Search API.

```python
pipeline.add(name="node").google_search(query="...")
```

**Parameters**

- `ad`, `ae`, `af`, `ag`, `ai`, `al`, `am`, `an`, `ao`, `aq`, `ar`, `as`, `at`, `au`, `aw`, `az`, `ba`, `bb`, `bd`, `be`, `bf`, `bg`, `bh`, `bi`, `bj`, `bm`, `bn`, `bo`, `br`, `bs`, `bt`, `bv`, `bw`, `by`, `bz`, `ca`, `cc`, `cd`, `cf`, `cg`, `ch`, `ci`, `ck`, `cl`, `cm`, `cn`, `co`, `cr`, `cs`, `cu`, `cv`, `cx`, `cy`, `cz`, `de`, `dj`, `dk`, `dm`, `do`, `dz`, `ec`, `ee`, `eg`, `eh`, `er`, `es`, `et`, `fi`, `fj`, `fk`, `fm`, `fo`, `fr`, `ga`, `gd`, `ge`, `gf`, `gh`, `gi`, `gl`, `gm`, `gn`, `gp`, `gq`, `gr`, `gs`, `gt`, `gu`, `gw`, `gy`, `hk`, `hm`, `hn`, `hr`, `ht`, `hu`, `id`, `ie`, `il`, `in`, `io`, `iq`, `ir`, `is`, `it`, `jm`, `jo`,
`jp`, `ke`, `kg`, `kh`, `ki`, `km`, `kn`, `kp`, `kr`, `kw`, `ky`, `kz`, `la`, `lb`, `lc`, `li`, `lk`, `lr`, `ls`, `lt`, `lu`, `lv`, `ly`, `ma`, `mc`, `md`, `mg`, `mh`, `mk`, `ml`, `mm`, `mn`, `mo`, `mp`, `mq`, `mr`, `ms`, `mt`, `mu`, `mv`, `mw`, `mx`, `my`, `mz`, `na`, `nc`, `ne`, `nf`, `ng`, `ni`, `nl`, `no`, `np`, `nr`, `nu`, `nz`, `om`, `pa`, `pe`, `pf`, `pg`, `ph`, `pk`, `pl`, `pm`, `pn`, `pr`, `ps`, `pt`, `pw`, `py`, `qa`, `re`, `ro`, `ru`, `rw`, `sa`, `sb`, `sc`, `sd`, `se`, `sg`, `sh`, `si`, `sj`, `sk`, `sl`, `sm`, `sn`, `so`, `sr`, `st`, `sv`, `sy`, `sz`, `tc`, `td`, `tf`, `tg`, `th`, `tj`, `tk`, `tl`, `tm`, `tn`, `to`, `tr`, `tt`, `tv`, `tw`, `tz`, `ua`, `ug`, `uk`, `um`, `us`, `uy`, `uz`, `va`, `vc`, `ve`, `vg`, `vi`, `vn`, `vu`, `wf`, `ws`, `ye`, `yt`, `za`, `zm`, `zw`
- One of: `events`, `hotels`, `image`, `news`, `web`

## `parallel_ai_search` — Parallel AI Search

Search the web with Parallel's Search API.

```python
pipeline.add(name="node").parallel_ai_search()
```

**Parameters**

- One of: `agentic`, `one-shot`
- One of: `""` (empty string), `base`, `pro`

## `perplexity_search` — Query the Perplexity search API

Query the Perplexity search API.

```python
pipeline.add(name="node").perplexity_search(query="...")
```

**Parameters**

- One of: `""` (empty string), `academic`, `web`
- One of: `""` (empty string), `day`, `month`, `week`, `year`

## `reducto_extract` — Reducto Extract

Extract structured data from documents using Reducto.

```python
pipeline.add(name="node").reducto_extract(json_schema="...", system_prompt="...")
```

**Parameters**

- Use your own Reducto API key instead of the platform default.
- Return citation bounding boxes for extracted fields.
- Your personal Reducto API key.
- Enable agentic deep extraction for higher accuracy. Uses iterative verification against the source material.
- Documents to extract data from (up to 2,500 pages per document).
- A JSON schema defining the structure of data to extract.
Use descriptive field names.
- Instructions for how the AI should extract and verify data from the documents.

## `rss` — RSS

RSS

```python
pipeline.add(name="node").rss()
```

**Parameters**

## `rss_feed_reader` — RSS Feed Reader

Read the contents from an RSS feed.

```python
pipeline.add(name="node").rss_feed_reader(url="...")
```

**Parameters**

- One of: `all`, `past day`, `past month`, `past week`

## `serp_api` — Serp API

Query the SERPAPI Google search API.

```python
pipeline.add(name="node").serp_api(query="...", api_key="...")
```

**Parameters**

## `url_loader` — URL Loader

Scrape the contents from a URL.

```python
pipeline.add(name="node").url_loader(api_key="...", url="...")
```

**Parameters**

- The provider to use for the URL Scraper. One of: `apify`, `jina`, `modal`
- Perform browser actions to interact with the input website
- Whether to recursively load the URL
- The API key to use
- The URL to load
- The maximum number of URLs to load
- The browser actions to perform on the URL
- Whether to enhance the content
- Load URLs to crawl from a sitemap. If the URL is a sitemap, it will be used directly. If the URL is not a sitemap, the sitemap will be fetched automatically.
- Use a proxy to crawl the website

## `web` — Web Search

Web Search

```python
pipeline.add(name="node").web()
```

**Parameters**

## `wikipedia` — Wikipedia

Query Wikipedia to return relevant articles.

```python
pipeline.add(name="node").wikipedia(query="...")
```

**Parameters**

- Whether to chunk the text
- The Wikipedia query
- The overlap of the chunks
- The size of the chunks to create

## `you_dot_com` — Query the You.com search API

Query the You.com search API.

```python
pipeline.add(name="node").you_dot_com(query="...", api_key="...")
```

**Parameters**

- Select the loader type: General or News. One of: `YOU_DOT_COM`, `YOU_DOT_COM_NEWS`
- The search query
- You.com API key

## `youtube` — Youtube

Get the transcript of a YouTube video.

```python
pipeline.add(name="node").youtube(url="...")
```

**Parameters**

- Whether to chunk the text
- The YouTube URL to get the transcript of
- The overlap of the chunks
- The size of the chunks to create
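
The `arxiv`, `wikipedia`, and `youtube` nodes each expose a chunking toggle, a chunk size, and a chunk overlap. As a rough illustration of how a chunk size and an overlap usually interact, here is a minimal sliding-window sketch. It is not VectorShift's implementation: the helper name `chunk_text`, its parameter names, and the choice to count characters rather than tokens are all assumptions made for this example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper for illustration; not part of the VectorShift SDK.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


transcript = "abcdefghij"
print(chunk_text(transcript, chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

With a chunk size of 4 and an overlap of 2, each chunk repeats the last two characters of the previous one, so text near a boundary appears in both neighbouring chunks. A larger overlap preserves more context across boundaries at the cost of producing more chunks.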