Data Collector

The data collector node allows a chatbot to collect information. It asks the user questions in a conversational manner until all the required fields are collected. The node can be found under the "Chat" sub-tab of the pipeline builder.

The node contains an LLM and calls itself recursively until all the information has been collected.
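The "keep asking until every field is filled" behavior can be pictured with a small Python sketch. This is not the node's actual implementation: `collect_fields` and `ask_user` are hypothetical names, a plain loop stands in for the recursive call, and a fixed question template stands in for the LLM that phrases questions and extracts answers.

```python
from typing import Callable, Dict, List


def collect_fields(
    required: List[str],
    ask_user: Callable[[str], str],
) -> Dict[str, str]:
    """Keep asking until every required field has a non-empty value."""
    collected: Dict[str, str] = {}
    while True:
        # Work out which fields are still missing on this turn.
        missing = [name for name in required if not collected.get(name)]
        if not missing:
            return collected
        # Assumption: ask for the first missing field with a fixed template.
        # In the real node an LLM would phrase the question.
        field = missing[0]
        answer = ask_user(f"What is your {field.lower()}?").strip()
        if answer:
            collected[field] = answer
```

In this sketch an empty answer leaves the field missing, so the same question is asked again on the next turn.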


In the example below, we want the chatbot to collect the user's name, email, and the product they are looking for.

Within the prompt field, specify the instructions for the data collector in natural language. For a simple implementation, we attach an input node and an output node to either side of the data collector (to allow for user queries and for the pipeline to display outputs). Additionally, we click the "Add Field" button twice so that three fields are available. Finally, we list the fields (Name, Email, Product) in the first column, give a description of each in the second column, and provide an example of each in the third column. This gives the LLM the context it needs to understand what kind of data to look for.


You are a lead collection engine. You collect the following pieces of data 
by requesting it from the user. If you have only received the info from the user, 
move on to asking a question to collect the next data point. 

1. Full name of the user
2. Email of the user
3. The product that the user is looking for

After collecting the information, finish by saying "Thank you very much".
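The three-column field table described above (name, description, example) can be thought of as structured context that is handed to the LLM alongside the prompt. The Python sketch below is only an illustration: the `Field` class, `render_field_context`, and the example values for Email and Product are hypothetical, and the node's real internal format is not documented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    name: str          # first column: field name
    description: str   # second column: what the field means
    example: str       # third column: a sample value


# Hypothetical rows mirroring the Name / Email / Product setup.
FIELDS = [
    Field("Name", "Full name of the user", "John Smith"),
    Field("Email", "Email of the user", "john.smith@example.com"),
    Field("Product", "The product that the user is looking for", "Running shoes"),
]


def render_field_context(fields: List[Field]) -> str:
    """Turn the field rows into numbered lines an LLM prompt could include."""
    lines = [
        f"{i}. {f.name}: {f.description} (e.g. {f.example})"
        for i, f in enumerate(fields, start=1)
    ]
    return "\n".join(lines)
```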


Clicking the gear icon on the data collector lets you disable "auto generate questions", which makes the prompt field disappear. To run the data collector in this case, you have to pass the output of the data collector into the prompt of an LLM. When "auto generate questions" is unchecked, the node outputs a JSON object containing the data it has collected in the current run. This is useful if you want a chatbot, for instance, to answer user questions after collecting user information (see the Lead Generation Chatbot Example below).
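As a rough illustration of passing the collected JSON into an LLM prompt, the Python sketch below parses a JSON string and embeds the filled-in fields next to the user's question. The JSON shape, the `build_llm_prompt` helper, and the prompt layout are all assumptions made for this example, not the platform's documented format.

```python
import json


def build_llm_prompt(collected_json: str, question: str) -> str:
    """Embed the data collector's JSON output in a prompt for a downstream LLM."""
    collected = json.loads(collected_json)
    # Assumption: fields not yet collected arrive as empty strings; drop them.
    known = {key: value for key, value in collected.items() if value}
    return (
        f"Collected user data: {json.dumps(known)}\n"
        f"Question: {question}"
    )
```

A downstream LLM node would then see both the collected fields and the current question in a single prompt.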

Additionally, you can choose the model used by the data collector (GPT-4 by default).


Iterate on the chatbot within the pipeline builder by clicking on "Run Pipeline" and selecting "Chatbot". You can then run the pipeline in a chat format within the pipeline builder.

Deploy the chatbot through the "Chat" tab or by clicking "Details" within the pipeline builder >> "Deploy As" >> Chatbot.

View collected chatbot data by navigating to the chatbot manager (edit chatbot >> manager) and clicking on "Collected Data".

Lead Generation Chatbot Example

LLM System Prompt

If you receive a Name (e.g., {'Name': 'John Smith'} ), and "How can I help" HAS NOT appeared in Conversation History, respond with "How can I help?".

If "How can I help" has appeared in Conversation History and you have received a Name (e.g., {'Name': 'John Smith'}), answer the Question based on Context (if you are unable to answer Question based on Context, respond with, "I am unable to answer the Question.")

If you do not receive a Name (e.g., if there is no name in Name: {'Name': ''}), ask the user: "What is your name?"

Please check again that you have followed all the instructions.

Here, the "auto generate questions" toggle is unchecked and the data collector is instructed to collect only the user's name. We also want the chatbot to answer user questions once the name has been collected, and to keep asking the user for their name until it is received.

View the prompt above to see how we achieve this through prompt engineering.
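The branching that the system prompt asks the LLM to follow can be written out as ordinary code to make the three cases explicit. This Python sketch is only a reading aid: in the pipeline the LLM makes this decision from the prompt, and `next_action` and its return labels are hypothetical.

```python
from typing import Dict, List


def next_action(collected: Dict[str, str], history: List[str]) -> str:
    """Mirror the three cases of the lead generation system prompt."""
    name = collected.get("Name", "").strip()
    if not name:
        # No name yet: keep asking "What is your name?"
        return "ask_name"
    greeted = any("How can I help" in turn for turn in history)
    if not greeted:
        # Name received, but the greeting has not appeared in the history.
        return "greet"
    # Name received and greeting already sent: answer the Question from Context.
    return "answer_from_context"
```

For example, `next_action({"Name": "John Smith"}, [])` returns `"greet"`, which corresponds to responding with "How can I help?".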
