What does a real dialogue with LLMs look like?

This post looks at a potential future paradigm for LLM interactions.

Introduction

Chat sessions are currently the main way of interacting with LLMs.

These are discrete, turn-based interactions that are all driven by the user.

The user has an idea, or concept, in their head that they would like to realize with the LLM's help. This idea does not exist on an island; it is embedded in a much larger context in the user's mind.

Then they have to describe this concept to the LLM. Both the idea itself and the surrounding context that bore it must be clear enough for the LLM to meaningfully solve the user's task.

For simple requests, describing the idea is not difficult. When there is a clear outcome achievable in one step, it is often enough to simply ask the LLM to take the action.

Dealing with complex tasks

But when the task is longer or more complex, this discrete chat-based interface becomes limiting.

The user starts with a blank text box and a mountain of context that must be organized and then laid out for the model.

This first message is very top-heavy. If the user could "copy-paste" the implicit context surrounding their idea into the LLM's context window, there would be a much stronger starting point and the work could begin in earnest. But these models are far from being mind readers, and communication is hard enough between people, let alone with machines.

A blank slate, every time

The user and model start from this shared blank slate. As a helping hand, the model's system prompt has ideally placed it at a generically useful spot in problem space for completing requests.

The user's complete, ideal output exists somewhere away from this generic, initialized point. The first message points the model toward that ideal output. Then the dance of discrete turns begins.

For the rest of the interaction, until the request is completed or the user gives up, each input -> response pair represents a discrete step towards the user's final, desired output.

On handling chat-based limitations

By the nature of discrete messages, these steps are large and noisy. This becomes clear when asking the model "Ask me any clarifying questions about things that are unclear, or things that would help to meaningfully improve your output." The model's response often reveals a host of hidden and incorrect assumptions that it has been operating under. The success of this mid-session clarification shows it for what it is: an attempt to overcome the noisiness of discrete, turn-based interactions.

Chat sessions become a working trajectory from the system prompt's initial position to the user's desired outcome. A naive user who does not care about the clarity of their presentation will make for incredibly noisy, unclear steps. A user who is clear, and/or very familiar with the model's tendencies and impulses, will be able to take a much clearer and smoother path to their desired outcome.

In fact, when users have used a certain model enough, they implicitly learn to handle the mistakes and hidden assumptions that models inevitably make from not having access to a pure copy of the user's mind and context. If a user is actively trying to learn and improve the outputs they get from a model, they can take smoother and smoother paths to their desired outcomes.

Once again, like the "Ask me clarifying questions..." approach, learning the model is another way of trying to overcome the limits of discrete, turn-based interactions.

A future paradigm for LLM interactions

We need a new paradigm for LLMs that breaks the limiting discrete, turn-based interactions.

To break the discreteness, we need the models to process a continuous stream of thoughts.

To break the turns, we need a natural conversational flow.

This requires the models to have an inner notion of the conversation, in order to know when there is enough useful information to make a meaningful contribution.
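As a rough illustration of this inner notion of the conversation, here is a minimal sketch of a continuous-stream loop. Everything in it is hypothetical: the `ContinuousSession` class, the fixed-count `should_contribute` heuristic, and the canned `contribute` response are stand-ins for a real model's judgment of when it has enough useful information to speak.

```python
from dataclasses import dataclass, field

@dataclass
class ContinuousSession:
    """Hypothetical sketch: accumulate a continuous stream of user
    thoughts and decide when enough context has built up to make a
    meaningful contribution, rather than replying after every turn."""
    buffer: list = field(default_factory=list)
    threshold: int = 3  # assumed stand-in for "enough useful information"

    def ingest(self, thought):
        # Consume one fragment of the stream; speak only when warranted.
        self.buffer.append(thought)
        if self.should_contribute():
            reply = self.contribute()
            self.buffer.clear()
            return reply
        return None  # stay silent and keep listening

    def should_contribute(self):
        # Placeholder heuristic; a real system would use the model's own
        # inner notion of the conversation, not a simple count.
        return len(self.buffer) >= self.threshold

    def contribute(self):
        # Placeholder: a real model would synthesize the buffered context.
        return "Response drawing on: " + "; ".join(self.buffer)

session = ContinuousSession()
replies = [session.ingest(t) for t in
           ["I want a parser", "for log files", "streamed in real time"]]
```

The key design point is that silence (`None`) is a first-class outcome: the model listens across fragments instead of being forced to respond at every discrete turn.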

To deal with complex ideas beyond current models' capabilities we need collaborators, not assistants. I argue it is not the models themselves that are fundamentally limited; it is mainly our way of using them.

With a continuous, conversational flow of thoughts, interacting with the models becomes like a good conversation with a colleague who's motivated and interested in the same problems.

Instead of taking large, plodding steps towards the target outcome, we approach it smoothly, following the continuous stream of thoughts.

Done well, these continuous models will even help users refine and clarify their ideas.

Conclusion

In short, moving from discrete, turn-based interactions to a conversational flow of ideas is one of the most impactful ways we can turn LLMs from hobbled, half-blind assistants into true colleagues and collaborators.