Analyst

A pipeline that mimics complex logical reasoning

Scenario: I am an analyst at a financial institution and want to build an assistant that helps me answer complex questions logically based on a set of files.

At a high level, we need:

  1. A way for users to input files and embed them into a semantic search database.

  2. A way for the user to ask a question and for an LLM to “break down” the broad question into sub-questions (this allows for more detailed responses, as more context is queried from the semantic database using these sub-questions).

  3. A way for each sub-question to query the semantic search database and return relevant context.

  4. A way for a second LLM to receive the relevant information queried from the semantic database and synthesize an answer.
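
Before building this in the Pipeline Builder, it can help to see the data flow in plain code. Below is a minimal Python sketch of the four stages; every function is a hypothetical stand-in for a pipeline node, not part of any Pipeline Builder API:

```python
# Hypothetical stand-ins for the four pipeline stages (illustration only).

def embed_files(files: list[str]) -> list[str]:
    """Stage 1: the real pipeline converts files to text and embeds them."""
    return files  # keep the raw text as our toy 'database'

def decompose(question: str) -> list[str]:
    """Stage 2: an LLM breaks the question into three sub-questions."""
    return [f"Sub-question {i} of: {question}" for i in (1, 2, 3)]

def retrieve(db: list[str], sub_question: str) -> str:
    """Stage 3: semantic search returns the most relevant passage."""
    return db[0] if db else ""

def synthesize(task: str, context: list[str]) -> str:
    """Stage 4: a second LLM combines the retrieved context into an answer."""
    return f"Answer to '{task}' based on {len(context)} retrieved passages."

def run_pipeline(files: list[str], question: str) -> str:
    db = embed_files(files)
    sub_questions = decompose(question)
    context = [retrieve(db, q) for q in sub_questions]
    return synthesize(question, context)
```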

Step 1 - Open the Pipeline Builder

Click “New” >> “Create a pipeline” within the “Pipeline” tab.

Step 2 - Embed Files

Allow end users to feed in files and have them embedded into a semantic search database:

  1. Add an Input node and set its “type” to “file” via the pull-down menu so the user can input a file (note: Input nodes default to text). Make sure to check “Process Files into Text”: the semantic search database doesn’t accept data of type “file”, only “text”.

  2. Connect this file-loading Input node to a Semantic Search node (through the “documents” edge) so the file is loaded into the database and can be queried (a sketch of this file-to-text embedding flow follows below).
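
Under the hood, this step amounts to: extract text from the file, split it into chunks, embed each chunk, and store the vectors. A rough sketch of that flow, assuming plain-text files and using a toy embedding in place of a real embedding model (the function names here are illustrative):

```python
from pathlib import Path

def chunk_text(text: str, size: int = 500) -> list[str]:
    """Split the extracted text into fixed-size chunks for embedding."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk: str) -> list[float]:
    """Toy embedding for illustration; a real pipeline calls an embedding model."""
    return [chunk.count(c) / max(len(chunk), 1) for c in "aeiou"]

def load_into_database(path: str) -> list[tuple[list[float], str]]:
    """'Process Files into Text', then store (vector, chunk) pairs for querying."""
    text = Path(path).read_text(encoding="utf-8")  # the file -> text conversion
    return [(embed(c), c) for c in chunk_text(text)]
```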

Step 3 - Breaking Down The Logic

Create an engine that breaks down the user’s question into sub-questions to enable deeper logical analysis.

  1. We create another Input node. This one accepts “text” (the default) and lets the user submit a question. We connect this node to the “prompt” edge of an LLM node.

  2. For the system prompt, we instruct the LLM to break the question down into three sub-questions and specify the exact output structure (in this case, each new question on its own line). Here, we write the system prompt in a Text node and connect it to the “system” edge of the LLM (you can either write system prompts in a Text node connected to the “system” edge, or type them directly into the system field).

  3. We then pass the output of the LLM into a custom data Transformation written in Python. It takes the string of questions in the specified output structure (here, separated by new lines) and splits it into individual questions (see the sketch after this list).

  4. We pass each of these questions as a query into the Semantic Search node from Step 2 to retrieve relevant information.
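
The custom Transformation in step 3 can be as simple as a newline split. A minimal sketch (the function name and signature are ours, not a fixed Pipeline Builder interface):

```python
def split_questions(llm_output: str) -> list[str]:
    """Split the first LLM's output into individual sub-questions.

    Assumes the system prompt forced one question per line.
    """
    return [line.strip() for line in llm_output.splitlines() if line.strip()]
```

For example, given `"What was Q3 revenue?\nHow did margins change?\nWhat drove costs?"`, this returns the three questions as a list, with any blank lines dropped; each element then becomes one query against the Semantic Search node.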

Step 4 - Use Second LLM

Finally, we use a second LLM to receive the information retrieved for each sub-question, synthesize it, and generate an answer.

  1. Connect the output of the Semantic Search node to a variable labeled “Context” in a Text node that will serve as the prompt for the second LLM.

  2. Also connect the Input node for the user’s question to a variable labeled “Task” in the same Text node, so the second LLM (described below) knows what the task is.

  3. Connect this Text node to the “prompt” edge of a second LLM node.

  4. System prompt: create another Text node to instruct the second LLM how to behave, and connect it to the “system” edge of the second LLM. In this case, we instruct the LLM to complete the Task given the Context.

  5. Connect an Output node to the output of the second LLM (a sketch of the resulting prompt assembly follows below).
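
Put together, the Text node from steps 1–3 behaves like a prompt template with two variables. A rough Python equivalent of what the second LLM ends up receiving (the variable names Context and Task come from the steps above; everything else, including the exact wording, is illustrative):

```python
SYSTEM_PROMPT = (
    "You are an analyst's assistant. "
    "Complete the Task using only the provided Context."
)

PROMPT_TEMPLATE = """Task: {task}

Context:
{context}"""

def build_prompt(task: str, retrieved_passages: list[str]) -> str:
    """Fill the Text node's Task and Context variables before the second LLM call."""
    return PROMPT_TEMPLATE.format(task=task, context="\n\n".join(retrieved_passages))
```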
