Engineering

ReAct Prompting (Reasoning and Acting) 🎭

Explore the framework that enables LLMs to interleave reasoning traces and task-specific actions, allowing them to use external tools and knowledge bases for more reliable and factual responses.

Driptanil DattaSoftware Developer

Mar 202612 min read

🌍

References & Disclaimer

This content is adapted from Prompting Guide: ReAct. It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.

Introduction

Chain-of-Thought (CoT) prompting has shown the capabilities of LLMs to carry out reasoning traces. However, its lack of access to the external world can lead to fact hallucinations and error propagation.

ReAct, proposed by Yao et al. (2022) (opens in a new tab), is a general paradigm that combines reasoning and acting with LLMs. It prompts LLMs to generate verbal reasoning traces and actions for a task in an interleaved manner.

How it Works

ReAct is inspired by the synergies between "acting" and "reasoning" that allow humans to learn new tasks and make decisions.

The framework allows the system to:

Induce and Update Plans: Using reasoning traces to track action progress.
Interface with External Sources: Using action steps to gather information from knowledge bases (e.g., Wikipedia) or tools.

ReAct Framework Image Source: Yao et al. (2022)

The Thought-Action-Observation Loop

In a typical ReAct trajectory, the model follows a structured cycle:

Thought: The model's reasoning about the current state.
Action: The specific tool or search query to execute.
Observation: The result returned from the external environment.

Knowledge-Intensive Tasks

ReAct is particularly effective for question answering (HotPotQA) and fact verification (Fever). By retrieving information to support reasoning, and using reasoning to target what to retrieve next, the model becomes significantly more factual.

ReAct Performance Table Image Source: Yao et al. (2022)

Example Trajectory (HotPotQA)

Question: Aside from the Apple Remote, what other devices can control the program Apple Remote was originally designed to interact with?

Thought 1: I need to search Apple Remote, find the program it was originally 
          designed to interact with, and then find other devices that can 
          control that program.
Action 1: Search[Apple Remote]
Observation 1: The Apple Remote is a remote control... designed to interact 
               with the Front Row media center program.
Thought 2: The program Apple Remote was designed to interact with is Front Row. 
          I need to search Front Row and find what other devices can control it.
Action 2: Search[Front Row (software)]
Observation 2: Front Row is a discontinued media center software... it can be 
               controlled by the Apple Remote or the Keyboard Function keys.
Thought 3: The other devices that can control Front Row are the Keyboard 
          Function keys. I now know the final answer.
Action 3: Finish[Keyboard Function keys]

Decision Making Tasks

ReAct also excels in interactive environments like ALFWorld (text-based games) and WebShop (online shopping simulations). In these tasks, reasoning helps the model decompose complex goals into manageable subgoals.

ALFWorld Example Image Source: Yao et al. (2022)

Implementing a ReAct Agent with LangChain

Modern libraries like LangChain have built-in support for the ReAct framework, allowing you to create autonomous agents with just a few lines of code.

Setup and Configuration

from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent
 
llm = OpenAI(model_name="text-davinci-003", temperature=0)
tools = load_tools(["google-serper", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

Running the Agent

When you run a complex query like: "Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?", the agent enters the ReAct loop:

> Entering new AgentExecutor chain...
Thought: I need to find out who Olivia Wilde's boyfriend is and then calculate his age.
Action: Search
Action Input: "Olivia Wilde boyfriend"
Observation: Olivia Wilde started dating Harry Styles...
Thought: I need to find out Harry Styles' age.
Action: Search
Action Input: "Harry Styles age"
Observation: 29 years
Thought: I need to calculate 29 raised to the 0.23 power.
Action: Calculator
Action Input: 29^0.23
Observation: Answer: 2.169459462491557
Final Answer: Harry Styles is 29 years old and his age raised to the 0.23 power is 2.169...

🚀

ReAct + CoT: The authors found that the best approach often involves combining ReAct with Chain-of-Thought (CoT), allowing the model to use both internal knowledge and external information simultaneously.

[!IMPORTANT] While powerful, ReAct depends heavily on the quality of retrieved information. Non-informative search results can derail the model's reasoning. Always ensure your tools provide clean, relevant data.

PAL Reflexion