How to build your Brand’s Digital Twin or Synthetic Persona

Oct 31, 2024


Imagine having an AI avatar that thinks, feels, and talks like your customers. You could ask it questions at any time, and so could anyone in your organization. You could better understand what's on its mind and ask whether it would like a product idea or an ad. Because this promise is so powerful, many companies are piloting digital twins or synthetic personas right now. But despite the euphoria, it turns out not to be that easy.

There is a simple reason why we are unlikely to see avatars that perfectly resemble our customers in the near term: the large language model technology we use works in a fundamentally different way from the human brain.

For a long time, science believed that language was a tool for humans to think. If this were the case, LLMs could learn the mechanics of thinking. But recent research contradicts this view (see this Nature article: http://www.nature.com/articles/s41586-024-07522-w). Language is primarily a tool to communicate the results of thought processes that are themselves much more complex: they emerge in parallel processes and specialized regions of the brain, and they are connected with experiences, with a body, and with a social context.

Self-driving car systems use deep learning technology similar to that behind LLMs and are probably the most data-intensive field in the world, with trillions of data points. However, this is still not enough to achieve driving autonomy outside of San Francisco. A human needs only a couple of driving lessons to accomplish the same task. This makes it clear that human intelligence is still very different from artificial intelligence.

It takes more than a ChatGPT login and a lot of data

It is simply not known when AI will be able to perfectly simulate human intelligence. But artificial intelligence EXCEEDS human intelligence in many specialized aspects and applications already today.

In this sense, we may not be able to clone our customers perfectly, but we may be able to produce clones that are even BETTER for research than real humans.

At the same time, since today's AI cannot perfectly replace humans, we must pull out all the stops and go beyond naive prompting or brute-force learning to develop AI that is useful beyond showroom toy examples.

Here are the three steps needed to make it work.

Make it THINK like your customer

When we ask "why did you rate that way" in an NPS survey, what customers answer are just top-of-mind associations and have little to do with what really drives them. Deep causal introspection is not easy for us. Loudspeaker customers say 'because of the great sound', restaurant customers say 'because it tastes good', and washing machine customers say 'because it washes well'. It turns out that none of these are the key drivers of customer behavior.

What this means for digital agents: As researchers, we're less interested in what customers "would say" than in what thoughts and attitudes really drive their behavior. We would like to talk to a customer (avatar) who is fully aware of his or her thoughts and behaviors, so we can interview them and expect unbiased answers.

To get these unbiased results, it turned out to be naive to simply feed the LLM (e.g. via a RAG approach) with descriptive data from past surveys. At best, this allows the AI to reproduce the biased data without any self-introspection.

Instead, we want to teach the machine what is relevant to customers: what moves them and what is just talk. We must first understand the causal drivers of thought and behavior. In technical fields, it is becoming more popular to feed LLMs with so-called knowledge graphs. Experts realized that LLMs are missing the logic of the inner mechanics of reality. In marketing, we call these inner mechanics a causal network. You build it with Causal AI software. For instance, the SUPRA CAUSAL AI software now has an output format designed to feed LLM training.

In other words, what we need to do is to understand what drives category buyers in our market by running a study that is analyzed with Causal AI. We can add other studies, for example, on the mechanics of certain product adoptions, on brand architectures, on touchpoint effectiveness, and the like. The more granular and causal, the better. But we need to start with causal insights, not data.

Data is the surface, causality is the inner mechanics. Data is the talk, causality is what makes people talk and behave a certain way.
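To make "feed causal insights, not data" a bit more tangible, here is a minimal sketch in Python. It assumes the causal analysis can be exported as a list of driver effects; the `DriverEffect` structure, the `render_causal_briefing` helper, the threshold, and the numbers are all hypothetical illustrations, not the output format of any particular Causal AI tool.

```python
from dataclasses import dataclass


@dataclass
class DriverEffect:
    """One edge of a causal network: how strongly a driver moves an outcome."""
    driver: str
    outcome: str
    effect: float  # hypothetical standardized effect taken from a causal analysis


def render_causal_briefing(effects: list[DriverEffect], min_effect: float = 0.1) -> str:
    """Turn causal effects into plain sentences an LLM can receive as context."""
    lines = []
    # Strongest effects first, so the most relevant drivers lead the briefing.
    for e in sorted(effects, key=lambda e: abs(e.effect), reverse=True):
        if abs(e.effect) >= min_effect:
            direction = "raises" if e.effect > 0 else "lowers"
            lines.append(f"Improving {e.driver} {direction} {e.outcome} (effect {e.effect:+.2f}).")
        else:
            lines.append(f"{e.driver.capitalize()} barely moves {e.outcome} within the observed range.")
    return "\n".join(lines)


# Made-up numbers for a loudspeaker brand, for illustration only.
effects = [
    DriverEffect("sound quality", "loyalty", 0.04),
    DriverEffect("reliability of operation", "loyalty", 0.45),
]
briefing = render_causal_briefing(effects)
print(briefing)
```

The point of the design is that the model receives statements about what drives behavior, rather than raw survey answers it could only parrot back.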

Make it FEEL like your customer

Imagine we want to better understand a wireless speaker user. We have already fed the system with causal insights that tell us that increasing sound quality does not drive much loyalty, but increasing reliability of operation does.

If we now ask the AI how “reducing sound quality to make the speaker cheaper” would appeal, it might respond, "That's a good idea."

But the causal insights you used for training do not always tell the full story. Sound quality might not be a driver that amazes customers, because it is already good enough. But cutting it substantially might be devastating. Such an insight lies outside the data space of the causal analysis.

This is why digital twins need more. We need to give them a “gut feeling” or “intuition”. We do this by incorporating foundational frameworks about people's inherent motivations.

Dirk Ziems & Steffen Schmidt from Concept M recently introduced a powerful approach to training AI on the fundamentals of morphological psychology. The idea is that all humans are driven by basic motives such as security, autonomy, stimulation, affiliation, recognition, and control.

With these insights, the twin will realize that decreasing sound quality undermines the ultimate reason for owning a sound system: stimulation.

To incorporate such insights, you need to conduct a qualitative study that creates the morphological profile of the domain. Based on this profile, you can create a training text document for LLMs. Such a document describes how the basic motives translate into the category at hand.
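What such a training text document could look like is easiest to show with a small sketch. The helper `build_motive_document` and the document layout are assumptions made for illustration, not the format used by Concept M. Only the stimulation entry is taken from the speaker example in this article; the other entries are deliberately left as placeholders, because they would have to come from the qualitative study.

```python
# The six basic motives named above.
BASIC_MOTIVES = ["security", "autonomy", "stimulation", "affiliation", "recognition", "control"]
PLACEHOLDER = "(finding from the qualitative study goes here)"


def build_motive_document(category: str, translations: dict[str, str]) -> str:
    """Render one line per basic motive, describing how it shows up in the category."""
    missing = [m for m in BASIC_MOTIVES if m not in translations]
    if missing:
        # Refuse to build a document that silently skips a motive.
        raise ValueError(f"No category translation for: {', '.join(missing)}")
    parts = [f"How basic motives show up in the category: {category}", ""]
    for motive in BASIC_MOTIVES:
        parts.append(f"{motive.capitalize()}: {translations[motive]}")
    return "\n".join(parts)


# Hypothetical input: placeholders everywhere except the one motive discussed above.
translations = {motive: PLACEHOLDER for motive in BASIC_MOTIVES}
translations["stimulation"] = (
    "Sound is the ultimate reason to own a speaker; clearly worse sound removes that reason."
)
document = build_motive_document("wireless speakers", translations)
print(document)
```

Forcing every motive to have an entry is a design choice: it keeps the twin's "gut feeling" from being built on a half-finished profile.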

Make it TALK like your customer

The text on which LLMs are trained is written language. As anyone can attest, this can be very different from spoken language. Most customers rarely write, and when they do interact with brands, they either speak or use spoken language when they write.

If you want a digital twin to talk like your customers, you need to train it with the syntax and rules of spoken language. This will help it learn syntactic and semantic understanding.

Having the right tone of voice will make interacting with the avatar feel much more real, and users of the AI will give more credibility to its responses.

Your Digital Twins Roadmap

Systems based on large language models are getting better and better. But they still work on a surface level by design. They learn how things relate to each other on the level of language. But thinking happens without language, on deeper and often unconscious levels.

We want a digital twin to respond like a real customer. It is not enough to train it with surface-level data (customer feedback, survey data). If overdone, such surface-level training can distract the model from what's really important. Instead,

we need to train it with knowledge of underlying, invisible, and often unconscious preferences (which we find through causal inference).
we need to train it with basic psychological operating principles based on basic motives.
we need to teach it the syntax and semantics of spoken language to make the output relatable and believable.

From persona to digital twin panels

There is no such thing as "your customer". Every customer is different.

There is no such thing as a customer "persona" or customer "segment". These are illusions created by humans to make things easier.

Sure, building personas can be an economical approach. But if we start to forget that they are just illusions, we are not only fooling ourselves, we are stopping progress.

If at all possible, you don't want a digital persona, you want a digital twin panel - an army of twins designed to mimic the complexity of your customer base.

Collect a representative sample of customers profiled with relevant information about that customer or prospect. Then you can customize the digital twin model you have built to become the twin of a specific customer or prospect.

Surveying this twin panel will be more realistic and predictive than any reductionist persona-based approach.
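Here is a rough sketch of how such a panel could be assembled and surveyed. Everything in it is an assumption for illustration: the profile fields, the `twin_prompt` and `survey_panel` helpers, and above all `stub_ask`, which stands in for a real LLM call and simply keys on one profile field so the example runs without any API.

```python
from collections import Counter
from typing import Callable

# Signature of whatever function sends a prompt and a closed question to a model.
AskFn = Callable[[str, str, list[str]], str]


def twin_prompt(shared_briefing: str, profile: dict[str, str]) -> str:
    """Specialize the shared twin briefing to one sampled customer."""
    traits = "; ".join(f"{key}: {value}" for key, value in profile.items())
    return f"{shared_briefing}\nAnswer as this specific customer ({traits})."


def survey_panel(question: str, options: list[str], prompts: list[str], ask: AskFn) -> dict[str, float]:
    """Ask every twin the same closed question and return the share of each answer."""
    if not prompts:
        raise ValueError("The panel is empty.")
    counts = Counter()
    for prompt in prompts:
        answer = ask(prompt, question, options)
        if answer not in options:
            raise ValueError(f"Unexpected answer: {answer!r}")
        counts[answer] += 1
    return {option: counts[option] / len(prompts) for option in options}


def stub_ask(prompt: str, question: str, options: list[str]) -> str:
    """Placeholder for a real LLM call; it only looks at one profile field."""
    return options[0] if "usage: heavy" in prompt else options[1]


# Hypothetical profiles; a real panel would use a representative customer sample.
profiles = [
    {"usage": "heavy", "age group": "25-34"},
    {"usage": "occasional", "age group": "45-54"},
    {"usage": "heavy", "age group": "35-44"},
]
prompts = [twin_prompt("You are a wireless speaker customer.", p) for p in profiles]
shares = survey_panel("Would you pay more for higher reliability?", ["yes", "no"], prompts, stub_ask)
print(shares)
```

Because each twin is just the shared model plus one customer's profile, the panel scales with the size of your sample rather than with the number of personas someone had time to write.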

This is how you 10x with digital twins

What good are these digital twins now? If they are so complex to build, do they make economic sense?

Here are 5 reasons why you need to build your digital twin panel and do it in 2025, not later:

Digital twins answer your surveys anyway: Just last week, Anthropic launched the "computer use" feature for Claude, which lets the model operate a computer and could answer surveys on behalf of users. The fight against survey fraud will get harder and harder. Surveys will become more expensive and less reliable.
Replace the guesswork: Ad hoc questions that businesses have can now be answered instantly by turning to your digital twin panel. Once companies get used to this, a culture of always asking questions rather than relying on expertise develops. We always want to rely on expertise. But when that is the only mode we have, most expertise turns out to be just guesswork.
The standard mode of research will be based on digital twins: While the first use will be qualitative research and qualitative interaction with digital twins, the second use will be quantitative research: Your digital twins will answer your survey and generate quantitative statistics. In minutes, not weeks.
"Real" research will not die, but it will change fundamentally: It will become high-tech and highly specialized. Traditional market research, by contrast, will decay very quickly. Just like when you log into your bank account, we will make sure we are talking to a real person. We will pay more for it. We will do more high-quality in-depth interviews, and when we do quant, we will run the data through causal inference methods.
The quality of insights will increase: How can asking a machine be better than asking a human? The kind of digital twins proposed here require deep insights based on causal inference and deeper psychological insights. Because companies will be forced to do proper research, what we get will be better than what we had without these deep dives. Building on these deeper insights, we can now build digital twins that are MORE aware of why people act and what they want than the real people are. As a result, extracting insights from the digital counterpart becomes easier and less biased.

THIS is how you 10x your insights.
