Scrape URL Node
This node allows you to scrape content from a provided URL.
URL: The URL of the webpage you want scraped
Type: Text
Face of the node
Provider: The provider you want to use for scraping. The available providers are: Default (Combination of multiple providers), Jina and Apify.
Recursive: Enables recursive scraping of URLs that share the same base URL. For example, https://vectorshift.ai/ and https://vectorshift.ai/enterprise both have the base URL https://vectorshift.ai/.
URL Limit (if Recursive is set to true): The maximum number of URLs to scrape during recursion.
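The interaction between Recursive and URL Limit can be pictured with a small sketch. This is an illustration only, not the node's actual implementation: the function names same_base and select_urls are hypothetical, "same base URL" is assumed to mean same scheme and host, and the limit is assumed to count the starting URL.

```python
from urllib.parse import urlparse


def same_base(url: str, base: str) -> bool:
    """Return True if `url` has the same scheme and host as `base` (assumed meaning of "same base URL")."""
    u, b = urlparse(url), urlparse(base)
    return (u.scheme, u.netloc) == (b.scheme, b.netloc)


def select_urls(start_url: str, discovered: list[str], url_limit: int) -> list[str]:
    """Pick at most `url_limit` URLs to scrape: the start URL first, then
    discovered links that share its base, skipping duplicates."""
    selected = [start_url]
    for link in discovered:
        if len(selected) >= url_limit:
            break  # URL Limit reached; stop following further links
        if same_base(link, start_url) and link not in selected:
            selected.append(link)
    return selected[:url_limit]


# Links found on the start page; example.com is off-site and is skipped.
links = ["https://vectorshift.ai/enterprise", "https://example.com/"]
print(select_urls("https://vectorshift.ai/", links, url_limit=5))
```

With the links above, only the start URL and https://vectorshift.ai/enterprise are selected, because the third link has a different base URL.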
In the gear:
Use Personal API Key: Lets you enter your own API key (note: Apify requires an API key).
AI Enhance Content: Content from the website will be passed to an LLM to clean it up.
Content: The raw text from the website
Type: Text
Example usage: {{url_loader_0.content}}
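To make the reference syntax concrete, here is a minimal sketch of how a placeholder such as {{url_loader_0.content}} could be filled in with a node's output. It is a conceptual illustration only and does not describe how the platform resolves variables internally; the resolve function and the outputs dictionary are hypothetical.

```python
import re

# Matches placeholders of the form {{node_name.field}}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def resolve(template: str, outputs: dict[str, dict[str, str]]) -> str:
    """Replace each {{node.field}} placeholder with the matching node output.

    Raises KeyError if a placeholder names an unknown node or field.
    """
    def substitute(match: re.Match) -> str:
        node, field = match.group(1), match.group(2)
        return str(outputs[node][field])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical scraped text standing in for the node's Content output.
outputs = {"url_loader_0": {"content": "Scraped page text"}}
print(resolve("Summarize this page: {{url_loader_0.content}}", outputs))
```

Here the placeholder is swapped for the text stored under the node name url_loader_0 and the output field content.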
Click the “+” button on the right side of the node to create a semantic search node and connect it to this node.
The semantic search node is commonly connected to this node because you will often want to embed the scraped data for querying.
Do not share your API key with anyone you do not trust.