AI-generated synthetic users offer a compelling glimpse into the future of market research.


Deckard: She’s a replicant, isn’t she?
Tyrell: I’m impressed. How many questions does it usually take to spot them?
Deckard: I don’t get it, Tyrell.
Tyrell: How many questions?
Deckard: Twenty, thirty, cross-referenced.
Tyrell: It took more than a hundred for Rachael, didn’t it?

Blade Runner, 1982

Sci-fi or strategy?

Androids and synthetic humans proliferate throughout science fiction. We are continually fascinated by the question of artificially generated empathy — can humans build something that thinks and feels like we do? And can we tell the difference?

In Blade Runner, the Voight-Kampff test distinguishes human from replicant.

And while we’re not in need of a Voight-Kampff test quite yet, distinguishing human-generated from AI-generated content is one of the most pressing questions of modern times. Generative AI has a writing style that is usually recognisable at first glance, and AI images and videos hold up to even less scrutiny. But it may soon become impossible to tell human and AI-generated content apart.

Synthetic users are the next evolution of this. When AI and human output are indistinguishable, it becomes possible to create coherent and consistent simulations that talk and behave in ways that cannot be told apart from a human. No longer just offering reactive content, AI will be able to proactively simulate people (synthetic users), and we will be able to interact with them.

Serious questions then arise: is what these synthetic users have to say useful, or meaningful? We know they will be able to convincingly relay emotions, experiences, and knowledge. But does the fact that they aren’t drawn from an individual experience undermine their validity? Worse, could they reinforce existing biases and stereotypes?

These questions aren’t just academic. If we can accurately simulate human behaviour and decision-making, it will revolutionise research, and how we test, understand, and predict reactions. This would supercharge the effectiveness of design and have benefits society-wide.


From synthetic data to synthetic users

Synthetic data has been used for decades to represent the patterns of real-world data without privacy or scarcity concerns. It can retain the underlying statistical properties of the original data and thus can supplement or even replace real datasets.

For data that has well-known distributions, correlations, and traits, simulation through mathematical models can create new data points. In sectors where data is in limited supply or is particularly difficult or costly to gather — such as finance and healthcare — synthetic data is an excellent solution. It is also a way to protect sensitive data, such as medical history, while still conducting valuable research and analysis.
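As a rough illustration of what ‘simulation through mathematical models’ can look like, here is a minimal sketch that fits a simple two-variable normal model to a handful of records and then draws new, synthetic records with the same means, spread, and correlation. The records, the column meanings (age and annual spend), and the function names are all invented for illustration; real synthetic data pipelines are usually far more sophisticated.

```python
import math
import random
import statistics


def fit_bivariate_normal(xs, ys):
    """Estimate means, standard deviations, and Pearson correlation."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_synthetic(params, n, seed=0):
    """Draw n synthetic (x, y) pairs from the fitted distribution."""
    mx, my, sx, sy, rho = params
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Two independent standard normal draws, mixed so that the
        # resulting pair carries the fitted correlation rho.
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        y = my + sy * (rho * z1 + math.sqrt(max(0.0, 1 - rho ** 2)) * z2)
        rows.append((x, y))
    return rows


# Hypothetical "real" records: (age, annual spend). Invented for illustration.
real = [(23, 410), (31, 560), (38, 610), (45, 790), (52, 820), (60, 990)]
ages, spend = zip(*real)
params = fit_bivariate_normal(ages, spend)

# None of the synthetic rows is a real person, but the set as a whole
# should have roughly the same statistical shape as the original.
synthetic = sample_synthetic(params, n=5000)
syn_params = fit_bivariate_normal(*zip(*synthetic))
```

Comparing `params` with `syn_params` shows how closely the synthetic set tracks the original. The normal model is an assumption made here for simplicity; it only suits data whose relationships are roughly linear.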

However, the definition and application of synthetic data have broadened significantly with the introduction of LLMs and generative AI. It can now be multimedia: videos, images, and text. This expansion of capabilities now also allows for the creation of convincing synthetic users.

Briefly defined, a synthetic user is an AI-generated persona that is intended to mimic human responses, behaviour, and experiences. Global market researchers are applying this in a multitude of ways, most notably in user experience, early-stage innovation, brand research, and go-to-market research.

One company already doing so is Booking.com, which used synthetic users to simulate harder-to-reach subgroups for its annual Travel Trends report. In doing so, the research team uncovered a nuance in the concept of ‘solo travel’ after noting a gap between human and synthetic responses. They re-examined the data and discovered that about half of the human respondents had misinterpreted the question.

Solo travellers walk a lonesome road. And also aren’t so great at reading comprehension.

By comparing synthetic and human responses, they were able to improve their insight into the habits and preferences of solo travellers. The interactivity of synthetic users opened completely new possibilities compared with traditional synthetic data.
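One simple way to spot a gap like this is to compare how answers are distributed across the two groups, question by question, and flag the questions where they diverge sharply. The sketch below is our own illustration, not a description of how Booking.com did it: the question names, answers, threshold, and function names are all hypothetical.

```python
from collections import Counter


def answer_shares(responses):
    """Return each answer's share of all responses."""
    counts = Counter(responses)
    total = len(responses)
    return {answer: n / total for answer, n in counts.items()}


def total_variation(human, synthetic):
    """Total variation distance between two answer distributions (0 to 1)."""
    p, q = answer_shares(human), answer_shares(synthetic)
    return 0.5 * sum(abs(p.get(a, 0) - q.get(a, 0)) for a in set(p) | set(q))


def flag_gaps(survey, threshold=0.3):
    """List questions where human and synthetic answers diverge sharply."""
    return [
        question
        for question, (human, synthetic) in survey.items()
        if total_variation(human, synthetic) > threshold
    ]


# Invented answers for illustration only; not data from any real study.
survey = {
    "travels_solo": (["yes"] * 50 + ["no"] * 50, ["yes"] * 15 + ["no"] * 85),
    "books_online": (["yes"] * 80 + ["no"] * 20, ["yes"] * 75 + ["no"] * 25),
}
flagged = flag_gaps(survey)
```

A flagged question is not proof that either group is wrong. It is a prompt to go back and look, which is where a misread question tends to surface.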


We are most human at the fringes

At Sense Worldwide, we pioneered the use of cognitive diversity to seek out unexpected perspectives to deliver breakthrough results. The Sense Network allows us to work with lead consumers, extreme users, niches, and target customers to co-create and stress-test ideas. We have always placed a big emphasis on the power of creativity and human perspectives.

It’s led us to some incredible success stories.

The Sense Network provides us with outsiders and creative thinkers, so why are we so interested in synthetic users? The answer is that generative AI can give us something we don’t look for: the mainstream. Synthetic users are mostly used to augment, or in some cases replace, general population studies. This is because the more niche an audience is, the harder it is to mimic convincingly.

AI is great at finding patterns and synthesising large datasets, but it cannot meaningfully offer insight without sufficient data to draw upon. Synthetic users allow us to supplement our lead user research with ‘typical’ views as a point of comparison. And even more intriguingly, with our deep reserve of fringe user data, we can train synthetic users on much more unusual and creative profiles than others can.

We’re only just beginning to find out what that could mean.


Putting synthetic users to the test

We like to play with AI techniques as soon as we can get our hands on them, and synthetic users are no exception. With a vast amount of qualitative data at our disposal, we had no shortage of training data to get started with. Using it, we’ve deployed synthetic users to aid with ideation, early-stage concept feedback, hypothesis generation, and even as a new way of interacting with a consumer profile. 

Multimodal AI models can examine and give feedback on UIs, images, and even videos.

During our earliest efforts, we were impressed by the speed and coherence of the results. However, the models struggled with context, and there was overlap, repetition, and a lack of novelty in the output. They did not produce anything on the level of our human network.

But the capabilities quickly evolved, and we kept exploring. We learned what tools worked, and when to use them. As the models improved, we were able to deploy them in increasingly advanced ways.

For one of our clients — a luxury automotive giant — we worked on a project to develop a robust target consumer profile. This was a wholly human process. But once we had the data, we were also able to develop a highly detailed synthetic user based on this consumer profile. It was possible to interact with the synthetic user via chat, on the phone, or even Zoom. It brought the profile to life and into the room in an entirely new way. Image and audio recognition meant we were able to test product ideas, advertisements, and stimuli on-demand to see how they resonated.

How did they perform?

The synthetic user developed throughout the project as it gained context, being able to draw all previous information from memory with perfect recall. We went back and forth comparing the synthetic insights to the human insights from early testing right through to the final stages.

Ironically, despite the reputation of AI as sycophantic, our synthetic user was consistently more sceptical and critical – pointing out what it thought was cliché or contrived. However, it was unable to pick up on specific and granular details in the same way as our human respondents. It was also unable to draw on outside experience or observations, for obvious reasons.

Overall, the synthetic feedback aligned well with the feedback we got from our human respondents. It acted more or less as a representation of the majority opinion. It also offered significant speed and volume – we could ask as many questions as we wanted, when we wanted.


Is the future synthetic?

Right now, synthetic users are somewhere between a curiosity and a convenience, and we’re not seeing them in mainstream usage quite yet. But LLMs are continually lowering the barrier to entry and changing what is possible. For example, it has become far easier to re-run scenarios under slightly altered conditions and observe immediate behavioural changes. Historically, this type of granular analysis happened post-launch in A/B tests — now it can happen much earlier. 

This also highlights perhaps the clearest benefit of synthetic users: their infinite iterability. Cost, timelines, scale, and survey fatigue – classic obstacles for normal market research – become non-issues. However, for all their convenience, synthetic users can’t yet match the depth and empathy of human conversations. They can scale, but they can’t surprise.

Still, as part of a research toolkit, they are likely to soon become indispensable. And the future offers even more enticing propositions:

  • Agentic systems: Current synthetic users are reactive. But imagine a synthetic user that is able to pipe up when something catches their interest or proactively ask questions. Such ‘agentic users’ could be used to simulate the step-by-step ‘experience’ of a product or service, mimicking valuable live feedback.

  • Synthetic networks: With examples of this showing up already, it will become possible to put synthetic users ‘in a room’ with each other so that they can hold debates, compare experiences, and respond to one another – like a workshop session. There is even scope for this to function as a hybrid activity between synthetic and human users.

Advances like these will create research environments that are continuous, interactive, and fully scalable – where you can ask anything, anytime, to anyone. For those of us in the business of asking questions, that makes for an exciting prospect.