Planning a usability test
One of the first steps in each round of usability testing is to develop a plan for the test. The purpose of the plan is to document what you are going to do: how you will conduct the test, what metrics you will capture, how many participants you will test, and what scenarios you will use.
Elements of a test plan
- Scope: Indicate what you are testing: give the name of the Web site, Web application, or other product, and specify how much of the product the test will cover (e.g., the prototype as of a specific date; the navigation; navigation and content).
- Purpose: Identify the concerns, questions, and goals for this test. These can be quite broad; for example, “Can users navigate to important information from the prototype’s home page?” They can also be quite specific; for example, “Will users easily find the search box in its present location?” In each round of testing, you will probably have several general and several specific concerns to focus on. Your concerns should drive the scenarios you choose for the usability test.
- Sessions: Describe the sessions and their length (typically 60 to 90 minutes). When scheduling participants, remember to leave time, usually 30 minutes, between sessions to reset the environment, to briefly review the session with the observer(s), and to allow a cushion for sessions that end a little late or participants who arrive a little late.
- Scenarios: Indicate the number and types of tasks included in testing. Typically, for a 60-minute test, you should end up with approximately 10 (+/- 2) scenarios for desktop or laptop testing and 8 (+/- 2) scenarios for a mobile/smartphone test. You may want to include more scenarios in the test plan so the team can choose the most appropriate tasks.
- Metrics:
  - Subjective metrics: Include the questions you will ask participants before the sessions (e.g., a background questionnaire), after each task scenario is completed (ease and satisfaction questions about the task), and when the session is completed (overall ease, satisfaction, and likelihood-to-use/recommend questions).
  - Quantitative metrics: Indicate the quantitative data you will measure in your test (e.g., successful completion rates, error rates, time on task); a sketch of the per-task data you might log follows this list.
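To make the quantitative metrics concrete, here is a minimal sketch, in Python, of the per-task record a note-taker or logging tool might capture for each participant. The schema and field names are illustrative assumptions, not part of any standard tool.

    from dataclasses import dataclass

    @dataclass
    class TaskResult:
        # One participant's result on one task scenario (hypothetical schema).
        participant_id: str
        task_id: str
        completed: bool           # successful task completion
        critical_errors: int      # errors that prevented correct completion
        noncritical_errors: int   # recovered errors; cost efficiency only
        seconds_on_task: float    # time on task
        ease_rating: int          # post-task ease, e.g., on a 1-7 Likert scale
        satisfaction_rating: int  # post-task satisfaction, same scale

    # Example: one record from one session.
    record = TaskResult(participant_id="P01", task_id="find-search-box",
                        completed=True, critical_errors=0, noncritical_errors=1,
                        seconds_on_task=74.0, ease_rating=6,
                        satisfaction_rating=5)

Capturing one such record per participant per scenario keeps the later analysis mechanical: each metric defined in the next section reduces to a simple aggregation over these records.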
Test metrics
- Successful Task Completion: Each scenario requires the participant to obtain specific data that would be used in a typical task. The scenario is successfully completed when the participant indicates they have found the answer or completed the task goal. In some cases, you may want to give participants multiple-choice questions. Remember to include the questions and answers in the test plan and provide them to note-takers and observers.
- Critical Errors: Critical errors are deviations at completion from the targets of the scenario, for example, reporting the wrong data value because of the participant’s workflow. Essentially, the participant will not be able to finish the task correctly. Participants may or may not be aware that the task goal is incorrect or incomplete.
- Non-Critical Errors: Non-critical errors are errors that the participant recovers from and that do not prevent successful completion of the task; they only make the task less efficient. For example, exploratory behaviors such as opening the wrong navigation menu item or using a control incorrectly are non-critical errors.
- Error-Free Rate: Error-free rate is the percentage of test participants who complete the task without any errors, critical or non-critical (a computation sketch follows this list).
- Time On Task: The amount of time it takes the participant to complete the task.
- Subjective Measures: These evaluations are self-reported participant ratings for satisfaction, ease of use, ease of finding information, etc., where participants rate the measure on a 5- to 7-point Likert scale.
- Likes, Dislikes, and Recommendations: Participants report what they liked most and least about the site and give recommendations for improving it.
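As referenced under Error-Free Rate, here is a minimal sketch, in Python, of how these metrics might be computed from logged results. The five result tuples are invented example data, and their layout (completed, critical errors, non-critical errors, seconds on task, ease rating) is an assumption for illustration.

    from statistics import mean, median

    # Hypothetical results for one scenario, one tuple per participant:
    # (completed?, critical errors, non-critical errors, seconds, ease 1-7)
    results = [
        (True,  0, 0,  62.0, 7),
        (True,  0, 2,  95.0, 5),
        (False, 1, 1, 180.0, 2),
        (True,  0, 0,  71.0, 6),
        (True,  0, 3, 140.0, 4),
    ]

    n = len(results)
    completion_rate = sum(done for done, *_ in results) / n
    error_free_rate = sum(crit == 0 and noncrit == 0
                          for _, crit, noncrit, *_ in results) / n
    times = [t for *_, t, _ in results]
    mean_ease = mean(e for *_, e in results)

    print(f"Successful completion: {completion_rate:.0%}")   # 80%
    print(f"Error-free rate:       {error_free_rate:.0%}")   # 40%
    print(f"Time on task: mean {mean(times):.0f}s, "
          f"median {median(times):.0f}s")                    # 110s / 95s
    print(f"Mean post-task ease (1-7): {mean_ease:.1f}")     # 4.8

With only five to ten participants, these percentages are descriptive rather than inferential, which is why the limitations section below cautions against treating them as statistically significant.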
Moderation techniques
Some common moderating techniques include:
- Concurrent Think Aloud (CTA) is used to understand participants’ thoughts as they interact with a product by having them think aloud while they work. The goal is to encourage participants to keep a running stream of consciousness as they work.
- In Retrospective Think Aloud (RTA), the moderator asks participants to retrace their steps when the session is complete. Often participants watch a video replay of their actions, which may or may not contain eye-gaze patterns.
- Concurrent Probing (CP) requires the researcher to ask follow-up questions as participants work on tasks, whenever they say something interesting or do something unique.
- Retrospective Probing (RP) requires waiting until the session is complete and then asking questions about the participant’s thoughts and actions. Researchers often use RP in conjunction with other methods: as the participant comments or acts, the researcher takes notes and follows up with additional questions at the end of the session.
Test limitations
Due to the very small sample size, and because participation is not random, feedback will not be representative of your entire customer base, nor will any data gathered be statistically significant. However, usability tests still yield important insights.
Best practices
- Treat participants with respect and make them feel comfortable.
- Remember that you are testing the site, not the users. Help participants understand that they are helping you test the prototype or Web site.
- Remain neutral – you are there to listen and watch. If the participant asks a question, reply with “What do you think?” or “I am interested in what you would do.”
- Do not jump in and help participants immediately and do not lead the participant. If the participant gives up and asks for help, you must decide whether to end the scenario, give a hint, or give more substantial help.
- The team should decide how much of a hint you will give and how long you will allow the participants to work on a scenario when they are clearly going down an unproductive path.
- Take good notes. Note-takers should capture what the participant did in as much detail as possible, as well as what they say (in their own words). The better the notes taken during the session, the easier the analysis will be.
- Measure both performance and subjective (preference) metrics. People’s performance and preference do not always match: often users will perform poorly but their subjective ratings are very high; conversely, they may perform well but their subjective ratings are very low.
  - Performance measures include success, time, errors, etc.
  - Subjective measures include users’ self-reported satisfaction and comfort ratings.