Avoid These Common User Testing Biases

The last time I facilitated a design sprint, I was the moderator for user testing the prototype. Even though I had moderated many times before, I was surprised by a novice impulse—I wanted the prototype to receive positive feedback. This impulse came despite knowing that some of the most critical findings come from an awkward moment, a frustrated user, or confusion over how a product works and its value.

The design sprint is not meant to be perfect, but rather fast and productive. However, if we wish to proceed scientifically, we need to ask: Did we unconsciously encourage a positive reaction as moderators? More often than not, the same team that works on a product also works on its validation afterward. How does this influence user testing?

Here are some of the most common testing biases to be aware of if you want your business to succeed. These biases are not limited to a design sprint context; you should be mindful of them anytime you conduct user testing.

Recall bias

People are bad at remembering, so make it easy.

Interviews are probably the most basic research instrument for UXers. They are a common introduction to user testing, and yet they rely entirely on a participant’s recall. Cognitive psychologists have long known that memory is, if anything, inaccurate. The following are good practices to help lessen the effects of this bias:

During script creation, include questions the participant will be able to answer easily.

Four ingredients that make memories retrievable are:

Recency — Things that happened recently
Frequency — Things that happened many times
Emotion — Things that are emotionally charged for the participant
Association — Things that are not isolated, but are part of other topics, people, and experiences for the participant

During moderation, talk slowly and don’t be afraid of silence. StatisticsHowTo suggests that remembering takes time, so participants will give more accurate answers if they feel they can pause for a second and recall comfortably.

Expectation bias

Your mind will show you more of what you want to see, so try to bring unpolluted minds into the room.

This is what was happening to me. Luckily, I became aware of it. Expectation bias happens when observers take measurements or notes in a skewed way that confirms their expectations or desires. As noted in the psychiatry paper by Williams et al., the following techniques can be employed to reduce this bias:

Assign observers and moderators who are independent of your project team (but preferably also UXers).
If that is not possible, include observers who don’t know specifics about your platform and the things you’re testing so they are not influenced by your hypotheses.
If that is not possible, consider different moderators/observers from your team, blinded to some phases of your project so they won’t be influenced by preconceptions.

Membership bias

A popular person is not necessarily a representative test participant, so make an effort to recruit properly.

Membership bias happens when the group you’re testing differs systematically from the population you’re studying. A common example in our industry goes as follows: Your clients consider, let’s say, customer support managers as part of their user profile. Then they ask their area directors to recommend people for a study. Directors consider their direct reports and inevitably recommend the most popular ones, often the most outspoken or the most rational. You end up with a group of people who are not only customer support managers but who also share other characteristics that might influence your results. These are my suggestions to lessen this distortion:

Be as involved as possible in the screening process. Time permitting, provide a screening questionnaire so you’re able to pick from different candidates and have a group with various points of view. Common things to ask in a screening questionnaire include how often they use certain devices or apps, their age, and, if the study is internal, how long they have been at the company.
Make it clear to your clients how important it is that participants be somewhat random, and not just their friends, or those who are happy to participate.
Recommend that your clients provide compensation to their participants. This way, participants feel their time is valued and giving feedback isn’t just a favor to the company. The profile sample will be more diverse when clients entice more than the usual volunteers.
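If the screening answers end up in a spreadsheet, one way to keep the final pick somewhat random while still covering various points of view is to group candidates by one screening answer and draw at random within each group. The sketch below is only an illustration under assumptions: the candidate records, the "tenure" field, and the pick_participants helper are hypothetical and would need to be adapted to your own questionnaire.

```python
import random
from collections import defaultdict


def pick_participants(candidates, group_key, per_group, seed=None):
    """Randomly pick up to `per_group` candidates from each screening group.

    `candidates` is a list of dicts built from screening answers
    (hypothetical format); `group_key` names the answer to group by.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable

    # Bucket candidates by their answer to the chosen screening question.
    groups = defaultdict(list)
    for candidate in candidates:
        groups[candidate[group_key]].append(candidate)

    # Draw at random inside each bucket so that no one hand-picks favorites.
    selected = []
    for key in sorted(groups):
        members = groups[key]
        selected.extend(rng.sample(members, min(per_group, len(members))))
    return selected


# Hypothetical screening results for an internal study.
candidates = [
    {"name": "P1", "tenure": "under 1 year"},
    {"name": "P2", "tenure": "under 1 year"},
    {"name": "P3", "tenure": "1 to 5 years"},
    {"name": "P4", "tenure": "1 to 5 years"},
    {"name": "P5", "tenure": "over 5 years"},
    {"name": "P6", "tenure": "over 5 years"},
]

chosen = pick_participants(candidates, "tenure", per_group=1, seed=7)
for person in chosen:
    print(person["name"], "-", person["tenure"])
```

Grouping before drawing is a design choice: a purely random draw from a small pool can still leave out a whole group, while picking by hand brings back the popularity problem described above.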

The most prevalent one: Hawthorne Effect

People don’t behave the same way when they are being watched, so make it as comfortable as possible for them to be natural.

Closely related to social desirability bias, this distortion arises when your participants behave differently because they know they are being observed. They’ll act more analytical, read your platform in more detail, or be a bit more positive in their opinions than when using your product alone.

An obvious way to avoid this distortion would be to observe your participants without them knowing. However, this is not an ethical practice; if you’re recording or streaming them, they should know it. Instead, consider the following:

Remove as many judgmental questions as possible from your script. People don’t want to look silly, so avoid phrasing your questions in terms of capability or ease. If you want to track ratings of these attributes, consider asking about them in a separate questionnaire, possibly after the thinking-aloud exercise.
Before starting the test, you should state that you’re testing the platform and not the participant.

Being aware of these biases makes us better researchers and workshop facilitators because we talk about our findings conservatively. It allows us to discuss the risks of jumping to conclusions too quickly, especially if you have participants who are skeptical of your methods. Demonstrate that you are aware of the pros and cons of what you are doing. That is the ultimate sign of professionalism.