
Directional Stimulus Prompting (DSP)

Master the technique of using a small, tunable policy model to generate 'hints' that guide a larger, frozen LLM toward specific desired outputs, such as more accurate summaries.

Mar 2026 | 5 min read
References & Disclaimer

This content is adapted from Prompting Guide: DSP. It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.

Introduction

Getting a large language model to produce output in a specific style or tone can be difficult with standard prompts. Directional Stimulus Prompting (DSP), proposed by Li et al. (2023), introduces a two-model system to bridge this gap.


How DSP Works

The core idea of DSP is to use a small, tunable policy LM to generate a "stimulus" or "hint" for every input. This stimulus is then appended to the original prompt to guide a much larger, frozen black-box LLM (like GPT-4).
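
The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `toy_policy_lm` and `toy_frozen_llm` are hypothetical stand-ins for a trained policy model and a black-box LLM API, and the prompt template is an assumption made for this example.

```python
from typing import Callable


def build_dsp_prompt(article: str, hint: str) -> str:
    """Append the stimulus (hint) to the original prompt."""
    return (
        f"Article: {article}\n"
        f"Hint: {hint}\n"
        "Q: Summarize the article in 2-3 sentences based on the hint.\n"
        "A:"
    )


def dsp_generate(
    article: str,
    policy_lm: Callable[[str], str],
    frozen_llm: Callable[[str], str],
) -> str:
    """Two-stage DSP call: small policy model first, frozen LLM second."""
    hint = policy_lm(article)                 # stage 1: generate the stimulus
    prompt = build_dsp_prompt(article, hint)  # add it to the prompt
    return frozen_llm(prompt)                 # stage 2: query the frozen LLM


def toy_policy_lm(text: str) -> str:
    """Hypothetical stand-in: treats the first three long words as keywords."""
    seen = []
    for word in text.replace(".", " ").replace(",", " ").split():
        if len(word) >= 8 and word.lower() not in seen:
            seen.append(word.lower())
    return "; ".join(seen[:3])


def toy_frozen_llm(prompt: str) -> str:
    """Hypothetical stand-in: echoes the hint instead of calling a real API."""
    hint_line = next(l for l in prompt.splitlines() if l.startswith("Hint: "))
    return f"(stub summary guided by {hint_line[6:]})"


article = "The committee approved the infrastructure budget after a lengthy discussion."
print(dsp_generate(article, toy_policy_lm, toy_frozen_llm))
```

Passing both models in as plain callables keeps the frozen LLM a true black box: only `policy_lm` would ever be trained or swapped.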

Figure: DSP framework comparison. Image source: Li et al. (2023)

The Policy Model

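The policy model can be relatively small compared to the target LLM. In Li et al. (2023), it is first trained with supervised fine-tuning on labeled data and then optimized using Reinforcement Learning (RL), where the reward is based on the frozen model's output. In this way it learns to generate the hints that lead to the highest quality output from the frozen model.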
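
To make the reward loop concrete, here is a toy sketch under strong simplifying assumptions. The "policy" is just a softmax over three hand-written candidate hints, the frozen LLM is a hypothetical stub, and the reward is a simple keyword-coverage score. The update uses the exact expected policy gradient, which is only feasible because the action space is tiny; it is not the training algorithm from the paper.

```python
import math

# Hypothetical candidate hints and reference keywords, for illustration only.
CANDIDATE_HINTS = ["weather", "budget; infrastructure", "budget"]
REFERENCE_KEYWORDS = ["budget", "infrastructure"]


def stub_frozen_llm(hint: str) -> str:
    """Stand-in for the frozen LLM: its output simply reflects the hint."""
    return f"The meeting covered {hint}."


def reward(output: str) -> float:
    """Fraction of reference keywords that appear in the LLM output."""
    text = output.lower()
    return sum(k in text for k in REFERENCE_KEYWORDS) / len(REFERENCE_KEYWORDS)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(steps: int = 200, lr: float = 0.5):
    """Gradient ascent on expected reward; only the policy logits change."""
    logits = [0.0] * len(CANDIDATE_HINTS)
    # The frozen LLM is only queried, never updated.
    rewards = [reward(stub_frozen_llm(h)) for h in CANDIDATE_HINTS]
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, rewards))
        # Exact policy gradient for a softmax policy: p_i * (r_i - E[r]).
        logits = [
            l + lr * p * (r - baseline)
            for l, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)


probs = train_policy()
best = max(range(len(probs)), key=probs.__getitem__)
print(CANDIDATE_HINTS[best])
```

In a realistic setting the policy would be a small language model generating free-form hints, and the gradient would be estimated from sampled outputs rather than computed exactly.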


Why it's Effective

DSP offers a middle ground between simple prompting and full model fine-tuning:

  1. Efficiency: You only need to tune the small policy model, while the massive target LLM remains frozen.
  2. Precision: The "stimuli" act as directional anchors that help keep the model from drifting during complex tasks like long-form summarization.
  3. Adaptability: The policy model can be quickly re-optimized for different tasks or styles without touching the core LLM's weights.


Example Use Case: In meeting summarization, a policy model might extract specific "key action items" as tokens and pass them as a stimulus. The larger LLM then uses these tokens to ensure the final summary is grounded in the most important parts of the transcript.
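
As a rough sketch of this use case, the snippet below formats a list of action items as a stimulus and checks whether a returned summary mentions each one. The function names, the prompt wording, and the substring-based check are all assumptions made for illustration; a real policy model would produce the action items itself.

```python
def format_stimulus(action_items: list[str]) -> str:
    """Join non-empty action items into a single hint string."""
    return "; ".join(item.strip() for item in action_items if item.strip())


def build_meeting_prompt(transcript: str, action_items: list[str]) -> str:
    """Add the action-item stimulus to a meeting summarization prompt."""
    return (
        f"Transcript:\n{transcript}\n\n"
        f"Key action items: {format_stimulus(action_items)}\n"
        "Write a short summary of the meeting that covers each key action item.\n"
        "Summary:"
    )


def missing_items(summary: str, action_items: list[str]) -> list[str]:
    """Return action items the summary does not mention (naive substring check)."""
    lowered = summary.lower()
    return [item for item in action_items if item.lower() not in lowered]


items = ["ship beta", "update docs"]
prompt = build_meeting_prompt("Alice: we should ship beta soon...", items)
print(missing_items("We agreed to ship beta next week.", items))
```

A check like `missing_items` could also serve as a crude reward signal when tuning the policy model, although exact substring matching will miss paraphrases.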


[!TIP] Directional Stimulus Prompting is part of a growing trend of using "LLMs to guide LLMs." To see how this concept evolves into fully automated instructions, explore Automatic Prompt Engineer (APE) next.

© 2026 Driptanil Datta. All rights reserved.


Last updated: Mar 16 2026