Friday, March 25, 2011

Testing Content

Nobody needs to convince you that it’s important to test your website’s design and interaction with the people who will use it, right? But if that’s all you do, you’re missing out on feedback about the most important part of your site: the content.
Whether the purpose of your site is to convince people to do something, to buy something, or simply to inform, testing only whether they can find information or complete transactions is a missed opportunity: Is the content appropriate for the audience? Can they read and understand what you’ve written?

A tale of two audiences

Consider a health information site with two sets of fact sheets: A simplified version for the lay audience and a technical version for physicians. During testing, a physician participant reading the technical version stopped to say, “Look. I have five minutes in between patients to get the gist of this information. I’m not conducting research on the topic, I just want to learn enough to talk to my patients about it. If I can’t figure it out quickly, I can’t use it.” We’d made some incorrect assumptions about each audience’s needs and we would have missed this important revelation had we not tested the content.

You’re doing it wrong

Have you ever asked a user the following questions about your content?
  • How did you like that information?
  • Did you understand what you read?
It’s tempting to ask these questions, but they won’t help you assess whether your content is appropriate for your audience. The “like” question is popular—particularly in market research—but it’s irrelevant in design research because whether you like something has little to do with whether you understand it or will use it. Dan Formosa provides a great explanation about why you should avoid asking people what they like during user research. For what’s wrong with the “understand” question, it helps to know a little bit about how people read.

The reading process

Reading is a product of two simultaneous cognitive elements: decoding and comprehension.
When we first begin to read, we learn that certain symbols stand for concepts. We start by recognizing letters and associating the forms with the sounds they represent. Then we move to recognizing entire words and what they mean. Once we’ve processed those individual words, we can move on to comprehension: Figuring out what the writer meant by stringing those words together. It’s difficult work, particularly if you’re just learning to read or you’re one of the nearly 50% of the population who have low literacy skills.
While it’s tempting to have someone read your text and ask them if they understood it, you shouldn’t rely on a simple “yes” answer. It’s possible to recognize every word (decode), yet misunderstand the intended meaning (comprehend). You’ve probably experienced this yourself: Ever read something only to reach the end and realize you don’t understand what you just read? You recognize every word, but because the writing isn’t clear, or you’re tired, the meaning of the passage escapes you. Remember, too, that if someone misinterpreted what they read, there’s no way to know unless you ask questions to assess their comprehension.
So how do you find out whether your content will work for your users? Let’s look at how to predict whether it will work (without users) and test whether it does work (with users).

Estimate it

Readability formulas measure the elements of writing that can be quantified, such as the length of words and sentences, to predict the skill level required to understand them. They can be a quick, easy, and cheap way to estimate whether a text will be too difficult for the intended audience. The results are easy to understand: many state the approximate U.S. grade level of the text.
You can buy readability software. There are also free online tools from Added Bytes, Juicy Studio, and Edit Central; and there’s always the Flesch-Kincaid Grade Level formula in Microsoft Word.
But there is a big problem with readability formulas: Most features that make text easy to understand—like content, organization, and layout—can’t be measured mathematically. Using short words and simple sentences doesn’t guarantee that your text will be readable. Nor do readability formulas assess meaning at all. For example, take the following sentence from A List Apart’s About page and plug it into a readability formula. The SMOG Index estimates that you need a third grade education to understand it:
We get more mail in a day than we could read in a week.
Now, rearrange the words into something nonsensical. The result: still third grade.
In day we mail than a week get more in a could we read.
Readability formulas can help you predict the difficulty level of text and help you argue for funding to test it with users. But don’t rely on them as your only evaluation method. And don’t rewrite just to satisfy a formula. Remember, readability formulas estimate how difficult a piece of writing is. They can’t teach you how to write understandable copy.
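If you want to see just how mechanical these formulas are, here’s a rough Python sketch of the SMOG calculation (my own, not taken from any readability library). The vowel-run syllable counter is a crude stand-in for a real one, and SMOG technically expects a 30-sentence sample, but it’s enough to make the point: the genuine sentence and the scrambled one get the same grade, because the formula only counts sentences and long words.

    import math
    import re

    def count_syllables(word):
        """Very rough syllable estimate: count runs of vowels.
        Real readability tools use dictionaries or smarter heuristics."""
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def smog_grade(text):
        """SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291.
        SMOG was designed for 30-sentence samples; one sentence is a stretch,
        but it illustrates how little the formula actually looks at."""
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
        return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291

    sample = "We get more mail in a day than we could read in a week."
    nonsense = "In day we mail than a week get more in a could we read."

    print(round(smog_grade(sample)))    # about grade 3
    print(round(smog_grade(nonsense)))  # same grade for the scrambled version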

Do a moderated usability test

To find out whether people understand your content, have them read it and apply their new knowledge. In other words, do a usability test! Here’s how to create task scenarios where participants interpret and use what they read:
  • Identify the issues that are critical to users and the business.
  • Create tasks that test user knowledge of these issues.
  • Tell participants that they’re not being tested; the content is.
Let’s say you’re testing the website for SEPTA, a mass transit authority. It offers several types of monthly passes that vary based on the mode of transportation used and the distance traveled: a TransPass, for example, lets you ride the subway, bus, or trolley, while a TrailPass also lets you ride regional rail trains. If you only wanted to test the interface, you might phrase the task like this:
Buy a monthly TrailPass.
But you want to test how well the content explains the difference between each pass so that people can choose the one that’s right for them. So phrase your task like this:
Buy the cheapest pass that suits your needs.
See the difference? The first version doesn’t require participants to consider the content at all. It just tells them what to choose. The second version asks them to use the content to determine which option is the best choice for them. Just make sure to get your participants to articulate what their needs are so you can judge whether they chose the right one.
Ask participants to think aloud while they read the content. You’ll get some good insight on what they find confusing and why. Ideally, you want readers to understand the text after a single reading. If they have to re-read anything, you must clarify the text. Also, ask them to paraphrase some sections; if they don’t get the gist, you’d better rewrite it.
To successfully test content with task scenarios and paraphrasing, you’ve got to know what the correct answer looks like. If you need to, work with a subject matter expert to create an answer key before you conduct the sessions. You can conduct live moderated usability tests either in person or remotely. But, there are also asynchronous methods you can use.

Do an unmoderated usability test

If you need a larger sample size, you’re on a small budget, or you’re squeezed for time, try a remote unmoderated study. Send people to the unmoderated user testing tool of your choice, like Loop11 or OpenHallway, give them tasks, and record their feedback. You can even use something like SurveyMonkey and set up your study as a multiple-choice test: It takes more work up front than open-ended questions because you must define the possible answers beforehand, but it will take less time for you to score.
The key to a successful multiple-choice test is writing strong questions and answer choices.
  • State the question in a positive, not negative, form.
  • Include only one correct or clearly best answer.
  • Come up with two–four incorrect answers (distractors) that would be plausible if you didn’t understand the text.
  • Keep the alternatives mutually exclusive.
  • Avoid giving clues in any of the answers.
  • Avoid “all of the above” and “none of the above” as choices.
  • Avoid using “never,” “always,” and “only.”
You may also want to add an “I don’t know” option to reduce guessing. This isn’t the SAT, after all. A lucky guess won’t help you assess your content.
Task scenario:
You want to buy traveler's checks with your credit card. Which percentage rate applies to the purchase?
Possible answers:
  • The Standard APR of 10.99%
  • The Cash Advance APR of 24.24%*
  • The Penalty APR of 29.99%
  • I don’t know
(*This is the correct answer, based on my own credit card company’s cardmember agreement.)
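Scoring a test like this is straightforward once the responses are structured. Here’s a minimal Python sketch of the tally, using made-up response data for illustration; the answer strings mirror the example above.

    from collections import Counter

    # Hypothetical responses exported from a survey tool, one answer per participant.
    responses = [
        "The Cash Advance APR of 24.24%",
        "The Standard APR of 10.99%",
        "The Cash Advance APR of 24.24%",
        "I don't know",
        "The Cash Advance APR of 24.24%",
    ]

    correct = "The Cash Advance APR of 24.24%"
    dont_know = "I don't know"

    tally = Counter(responses)
    total = len(responses)

    print(f"Chose the correct answer: {tally[correct] / total:.0%}")
    print(f"Admitted they didn't know: {tally[dont_know] / total:.0%}")
    for answer, count in tally.items():
        if answer not in (correct, dont_know):
            print(f"Picked a distractor: {answer} ({count / total:.0%})")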
As with moderated testing, make it clear to participants that they’re not being tested; the content is.

Use a Cloze test

A Cloze test removes certain words from a sample of your text and asks users to fill in the missing words. Your test participants must rely on the context, as well as their prior knowledge of the subject, to identify the deleted words. It’s based on the Gestalt theory of closure—where the brain tries to fill in missing pieces—and applies it to written text.
It looks something like this:
If you want to __________ out whether your site __________ understand your content, you __________ test it with them.
It looks a lot like a Mad Lib, doesn’t it? Instead of coming up with a sentence that sounds funny or strange or interesting, participants must guess the exact word the author used. While Cloze tests are uncommon in the user experience field, educators have used them for decades to assess whether a text is appropriate for their students, particularly in English-as-an-additional-language instruction.
Here’s how to do it:
  • Take a sample of text—about 125–250 words.
  • Remove every fifth word, replacing it with a blank space.
  • Ask participants to fill in each space with the word they think was removed.
  • Score the answers by counting the number of correct answers and dividing that by the total number of blanks.
A score of 60% or better indicates the text is appropriate for the audience. Participants who score 40–60% will have some difficulty understanding the original text. It’s not a deal breaker, but it does mean that the audience may need some additional help to understand your content. A score of less than 40% means that the text will frustrate readers and should be rewritten.
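Both the passage preparation and the scoring are easy to script. Here’s a minimal Python sketch, assuming strict scoring (only the exact word that was deleted counts, ignoring case and punctuation); the sample sentence is just a plausible reconstruction of the blanked example above. Compare the resulting percentage against the 60% and 40% thresholds.

    import re

    def make_cloze(text, nth=5):
        """Replace every nth word with a blank.
        Returns the Cloze passage and the answer key (the removed words)."""
        words = text.split()
        answers = []
        for i in range(nth - 1, len(words), nth):
            answers.append(words[i])
            words[i] = "__________"
        return " ".join(words), answers

    def score_cloze(answers, responses):
        """Share of blanks filled with the exact word that was removed."""
        norm = lambda w: re.sub(r"[^a-z']", "", w.lower())
        correct = sum(norm(a) == norm(r) for a, r in zip(answers, responses))
        return correct / len(answers)

    passage, key = make_cloze(
        "If you want to find out whether your site visitors understand "
        "your content, you should test it with them."
    )
    print(passage)  # If you want to __________ out whether your site __________ ...
    print(score_cloze(key, ["find", "visitors", "should"]))  # 1.0: every blank matched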
It might sound far-fetched, but give this method a try before you dismiss it. In a government study on healthcare information readability, an expert panel categorized health articles as either easy or difficult. We ran a Cloze test using those articles with participants—who had low to average literacy skills—and found that the results reflected the expert panel’s findings. The average score for the “easy” version was 60, indicating the article was written at an appropriate level for these readers. The average score for the “difficult” version was 39: too hard for this audience.
Cloze tests are simple to create, administer, and score. They give you a good idea as to whether the content is right for the intended audience. If you use Cloze tests—either on their own or with more traditional usability testing methods—know that it takes a lot of cognitive effort to figure out those missing words. Aim for at least 25 blanks to get good feedback on your text; more than 50 can be very tiring.

When to test

Test your content at any point in your site development process. As long as you have content to test, you can test it. Need to convince your boss to budget for content testing? Run it through a readability formula. Got content but no wireframes or visual design? Run a Cloze test to evaluate content appropriateness. Is understanding the content key to a task or workflow? Display it in context during usability testing.

What to test

You can’t test every sentence on your site, nor do you need to. Focus on tasks that are critical to your users and your business. For example, does your help desk get calls about things the site should communicate? Test the content to find out if and where the site falls short.

So get to it

While usability testing watches what users do, not what they say they do, content testing determines what users understand, not what they say they understand.
Whatever your budget, timeline, and access to users, there’s a method to test whether your content is appropriate for the people reading it. So test! And then, either rest assured that your content works, or get cracking on that rewrite. 
