Many UX professionals gravitate toward qualitative (qual) methods, which are widely perceived as easier and cheaper than quantitative (quant) research. They shy away from the intimidating prospect of larger sample sizes and statistics associated with quant.

If that sounds like you, you’re missing out! Quant methodologies are an important part of any experienced UX researcher’s toolkit. Quant methods allow you to:

  • Put a number on the usability of your product. Numbers are sometimes more persuasive than findings and videos from qual testing (particularly when you’re trying to convince folks like executives).
  • Compare different designs (for example, your new version vs. your old version, or your product vs. your competitor’s product), and determine whether the differences you observe are statistically significant, and not due to random chance.
  • Improve UX trade-off decisions. For example, if a proposed design improvement is expected to be expensive to implement, is it worth doing? If you have an estimate of how much the change will improve the usability, a quant method may help you decide whether the redesign is worth it.
  • Tie UX improvements back to organizational goals and key performance indicators (thus demonstrating your return on investment and justifying your UX team’s existence).

This article can help you get started — the first step is determining which quant UX research method you need. We’ll cover some of the most popular types of quant research:

  • Quantitative Usability Testing (Benchmarking)
  • Web Analytics (or App Analytics)
  • A/B Testing or Multivariate Testing
  • Card Sorting
  • Tree Testing
  • Surveys or Questionnaires
  • Clustering Qualitative Comments
  • Desirability Studies
  • Eyetracking Testing

Each of these methods yields valuable quantitative data, but the techniques vary widely in the type of data collected, as well as the amount of resources and effort required.

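This article lists the most common use cases for these methods and estimates the cost and difficulty of each. As with any research method, each of these can be adapted to fit a variety of needs. Depending on your specific circumstances, your costs and difficulty may differ from our rough estimates. Additionally, you should be aware that each of these methods requires a different minimum sample size to determine statistical significance.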

Quantitative Usability Testing (Benchmarking)

  • Uses:
    • Tracking usability over time
    • Comparing with competitors
  • Cost: Medium
  • Difficulty of Collection: Medium
  • Difficulty of Analysis: Medium
  • Type of Method: Behavioral (what people do)
  • Context of Use: Task-based

Although it is used less often than its qualitative counterpart, quantitative usability testing (sometimes referred to as usability benchmarking) is a lot like qualitative usability testing: users are asked to perform realistic tasks using a product. The primary difference between the two is that qual usability testing prioritizes observations, like identifying usability issues. In contrast, quant usability testing is focused on collecting metrics like time on task or success.

Once you’ve collected those metrics with a relatively large sample size (around 35 participants or more), you can use them to track the progress of your product’s usability over time, or compare it to the usability of your competitors’ products.
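A success rate from 35 or so participants is still an estimate, so it helps to report it with a range. Here is a minimal Python sketch of one common way to do that, the adjusted-Wald confidence interval. The function name and the participant counts are hypothetical, and this is an illustration under those assumptions rather than a prescribed analysis.

```python
import math

def adjusted_wald_interval(successes: int, trials: int, z: float = 1.96):
    """Return (low, high) adjusted-Wald confidence bounds for a success rate."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    # Add z^2/2 successes and z^2 trials, then compute the usual Wald interval.
    n_adj = trials + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Hypothetical benchmark: 28 of 35 participants completed the task.
low, high = adjusted_wald_interval(28, 35)
print(f"Observed success rate: {28 / 35:.0%}")
print(f"Approximate 95% interval: {low:.0%} to {high:.0%}")
```

If the intervals from two rounds of benchmarking overlap heavily, the apparent change between them may be noise rather than a real shift in usability.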

When you track a usability metric over time, across many different iterations of a product, you can create charts like this one. This type of information can help you keep an eye on your product’s UX, and make sure it improves over time.

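The type of usability testing you choose (in-person, remote moderated, or remote unmoderated) will affect the cost and difficulty associated with this method. Since the goals of quant and qual studies are different, the structure of the test and the tasks used will need to be different as well.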

For all the skills you need to run a basic quantitative usability-testing study, see our full-day course Measuring User Experience.

Web Analytics (or App Analytics)

  • Uses:
    • Detecting or prioritizing problems
    • Monitoring performance
  • Cost: Low
  • Difficulty of Collection: Low
  • Difficulty of Analysis: High
  • Type of Method: Behavioral (what people do)
  • Context of Use: Live

Analytics data describes what people are doing with your live product: where they go, what they click on, what features they use, where they come from, and on which pages they decide to leave the site or app. This information can support a wide variety of UX activities. In particular, it can help you monitor the performance of various content, UIs, or features in your product, and identify what doesn’t work.

For an explanation of the differences between analytics and quant usability testing, watch this 2-minute video.

For more on analytics with a special focus on how these methods fit within UX, see our full-day course Analytics and User Experience.

A/B Testing or Multivariate Testing

  • Uses: Comparing two design options
  • Cost: Low
  • Difficulty of Collection: Low
  • Difficulty of Analysis: Low
  • Type of Method: Behavioral (what people do)
  • Context of Use: Live

While you can use analytics metrics to monitor your product’s performance (as described above), you can also create experiments that detect how different UI designs change those metrics, through either A/B testing or multivariate testing.

In A/B testing, teams create two different live versions of the same UI, and then show each version to different users to see which version performs best. For example, you might create two versions of the same call-to-action button label: Get Pricing vs. Learn More. Then you could track the number of clicks that the button receives in the two versions. Multivariate testing is similar, but involves testing several design elements at once (for example, the test could involve different button labels, typography, and placement on the page).

Both of these analytics-based experiments are great for deciding among different variations of the same design — and can put an end to team disputes about which version is best.

A/B testing splits your incoming site traffic (users), and directs some users to one version of the UI, and others to the other version.

A major downside to this methodology is that it’s often abused. Some teams fail to run the tests as long as they should, and make risky decisions based on small numbers.
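To see why small numbers are risky, here is a rough Python sketch of a two-proportion z-test, one standard way to check whether a difference in click rates is statistically significant. The function name and all traffic figures are hypothetical. Real A/B tools may use different statistics, so treat this as an illustration only.

```python
import math

def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference between two click rates."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_err == 0:
        raise ValueError("click rates of exactly 0% or 100% cannot be tested this way")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical results: the same 12% vs. 13.5% click rates at two traffic levels.
small = two_proportion_z_test(24, 200, 27, 200)
large = two_proportion_z_test(1200, 10000, 1350, 10000)
for label, (z, p) in (("400 users", small), ("20,000 users", large)):
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: z = {z:.2f}, p = {p:.3f} ({verdict} at the 0.05 level)")
```

The observed click rates are identical in both cases. Only the amount of traffic changes, and that is what determines whether the difference can be distinguished from chance.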

For more on A/B and multivariate testing for UX, see our full-day course Analytics and User Experience.

Card Sorting

  • Uses: Determining information-architecture labels and structures
  • Cost: Low
  • Difficulty of Collection: Low
  • Difficulty of Analysis: Medium
  • Type of Method: Attitudinal (what people say)
  • Context of Use: Not using product

In a card-sorting study, participants are given content items (sometimes literally written on index cards) and asked to group and label those items in a way that makes sense to them. This test can either be conducted in person, using physical cards, or remotely using a card-sorting platform like OptimalSort.

When card sort tests are conducted in person, the user sorts and categorizes physical cards. Each card contains a description of the content it represents.

This method gives you the opportunity to get into users’ mental models of the information space. What terminology do they use? How do they logically group these concepts together?

Quantitative analysis of the percentage of participants who created similar groupings can help establish which categorization approach would be understandable to most users.
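As a rough illustration of that percentage analysis, the Python sketch below counts how often each pair of cards landed in the same group. The card names, the data layout, and the function name are all hypothetical. Card-sorting platforms typically produce a similar matrix for you.

```python
from collections import Counter
from itertools import combinations

def pair_agreement(sorts):
    """For each pair of cards, return the share of participants who grouped them together."""
    together = Counter()
    for groups in sorts:
        for group in groups:
            for pair in combinations(sorted(group), 2):
                together[pair] += 1
    return {pair: count / len(sorts) for pair, count in together.items()}

# Hypothetical data: each participant's sort is a list of groups of card names.
sorts = [
    [{"Leashes", "Collars"}, {"Kibble", "Treats"}],
    [{"Leashes", "Collars", "Treats"}, {"Kibble"}],
    [{"Leashes", "Collars"}, {"Kibble", "Treats"}],
    [{"Leashes"}, {"Collars", "Kibble", "Treats"}],
]
agreement = pair_agreement(sorts)
for pair, share in sorted(agreement.items(), key=lambda item: -item[1]):
    print(f"{pair[0]} + {pair[1]}: {share:.0%}")
```

Pairs that most participants put together are good candidates for sharing a category. Pairs that never co-occur do not appear in the result at all.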

Tree Testing

  • Uses: Evaluating information-architecture hierarchies
  • Cost: Low
  • Difficulty of Collection: Low
  • Difficulty of Analysis: Medium
  • Type of Method: Behavioral (what people do)
  • Context of Use: Task-based, not using product

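In a tree test, participants attempt to complete tasks using only the category structure of your site. It’s essentially a way to evaluate your information architecture by isolating it from all other aspects of your UI.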

For example, imagine your product is a pet-supplies website, and this is your top-level hierarchy.

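A visualization of your hierarchy might look like this. Participants in the tree test are asked to find a specific item in the hierarchy (for example, collars). They first see only the top-level categories (for example, Dogs, Cats, Birds, etc.). Once they make a selection (Dogs), they see the child categories of that selection.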

You might ask your participants in one task to find the dog collars. Quantitative analysis of the tree-test results will show whether people were able to find the right path to this item in the information hierarchy. How many participants picked the wrong category?

This method is useful in identifying if an IA structure, labels, and placements agree with people’s expectations.
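Here is a small Python sketch of how the results of a single tree-test task could be summarized. The recorded paths, the "Toys" category, and the function name are hypothetical. A tree-testing tool would normally export comparable numbers.

```python
from collections import Counter

def summarize_tree_task(paths, correct_path):
    """Summarize one tree-test task from the category paths participants chose."""
    total = len(paths)
    successes = sum(1 for path in paths if path == correct_path)
    # The first click shows which top-level category participants expected.
    first_clicks = Counter(path[0] for path in paths if path)
    wrong_first = total - first_clicks[correct_path[0]]
    return {
        "success_rate": successes / total,
        "wrong_first_click_rate": wrong_first / total,
        "first_clicks": dict(first_clicks),
    }

# Hypothetical task: find the dog collars. Each tuple is one participant's path.
paths = [
    ("Dogs", "Collars"),
    ("Dogs", "Collars"),
    ("Dogs", "Toys"),
    ("Cats", "Collars"),
    ("Dogs", "Collars"),
]
summary = summarize_tree_task(paths, correct_path=("Dogs", "Collars"))
print(f"Success rate: {summary['success_rate']:.0%}")
print(f"Wrong top-level category: {summary['wrong_first_click_rate']:.0%}")
print("First clicks:", summary["first_clicks"])
```

Separating the first click from overall success helps show whether a problem sits in the top-level labels or deeper in the hierarchy.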

For more information about how to design and evaluate information architecture, see our full-day course Information Architecture.

Surveys and Questionnaires

  • Uses: Gather information about your users, their attitudes, and behaviors
  • Cost: Low
  • Difficulty of Collection: Low
  • Difficulty of Analysis: Low
  • Type of Method: Attitudinal (what people say)
  • Context of Use: Any

Surveys are a flexible user-research tool. You can administer them in a variety of contexts — as short intercept surveys on a live website, in emails, or after a usability test.

They can produce a combination of quantitative and qualitative data: ratings, proportions of answers for each choice in a multiple-choice question, as well as open-ended responses. You can even turn qualitative responses to a survey into numerical data (see the following section on coding qualitative comments).

With semantic differential rating scales like this one, each radio button stands for a numerical value. Respondents can choose Easy to Use (1), Difficult to Use (5), or a value in between. The average response to this question measures the perceived difficulty of your app.

You can create your own custom surveys, or you can use one of the many established questionnaires (for example, the System Usability Scale or Net Promoter Score). An advantage of one of those questionnaires is that you can often compare your result to industry or competitor scores, to see how you’re doing. Even if you create your own custom questionnaire, you can still track your average scores over time, to monitor product improvements.
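As an illustration of how an established questionnaire turns answers into a number, here is a minimal Python sketch of the standard System Usability Scale scoring: ten items answered from 1 to 5, with odd-numbered items positively worded and even-numbered items negatively worded. The response data and function name are hypothetical.

```python
from statistics import mean

def sus_score(responses):
    """Convert one respondent's ten 1-5 SUS answers into a 0-100 score."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten answers, each from 1 to 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item_number % 2 == 1 else (5 - answer)
    return total * 2.5

# Hypothetical answers from three respondents.
respondents = [
    [4, 2, 5, 1, 4, 2, 4, 2, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
scores = [sus_score(answers) for answers in respondents]
print("Individual scores:", scores)
print(f"Average SUS score: {mean(scores):.1f}")
```

Three respondents are far too few for a reliable average. The sample is kept tiny here only so the arithmetic is easy to follow.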

For more on designing surveys, as well as many qualitative user research methods, see our full-day course User Research Methods: From Strategy to Requirements to Design.

Clustering Qualitative Data

  • Uses: Identifying important themes in qualitative data
  • Cost: Low
  • Difficulty of Collection: Medium
  • Difficulty of Analysis: Medium
  • Type of Method: Attitudinal (what people say)
  • Context of Use: Any

This technique is less of a data-collection methodology, and more of an analysis approach for qualitative data. It involves grouping observations from a qualitative study (for example, a diary study, survey, focus group, or interviews) based on common themes. If you have a lot of observations, you can count the number of instances when a particular theme is mentioned.

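For example, imagine you run a diary study asking participants to report every time they use your product in their daily lives for a week, with the goal of understanding the contexts in which they use your product. You can count the instances when people used the product at work, in their homes, or on the go.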

This method can identify the prevalence or frequency of a specific theme or situation — for example, the frequency of a user complaint or of a UI problem.
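Once a researcher has tagged each observation with themes, the counting itself is simple. The Python sketch below assumes hypothetical diary entries that have already been coded by hand. The tags and the function name are made up for illustration.

```python
from collections import Counter

def theme_prevalence(coded_entries):
    """Return {theme: (count, share of entries)} for manually coded observations."""
    # set() ensures a theme is counted at most once per entry.
    counts = Counter(theme for themes in coded_entries for theme in set(themes))
    total = len(coded_entries)
    return {theme: (count, count / total) for theme, count in counts.most_common()}

# Hypothetical diary entries, each already tagged by a researcher.
entries = [
    ["at work"],
    ["at home"],
    ["on the go", "at work"],
    ["at work"],
    ["at home"],
]
prevalence = theme_prevalence(entries)
for theme, (count, share) in prevalence.items():
    print(f"{theme}: {count} entries ({share:.0%})")
```

The time-consuming part is the manual coding that produces the tags, not the tally.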

This approach is a good way to mine numerical data from large amounts of qualitative information, but it can be quite time consuming.

Desirability Studies

  • Uses: Identifying attributes associated with your product or brand
  • Cost: Low
  • Difficulty of Collection: Low
  • Difficulty of Analysis: Low
  • Type of Method: Attitudinal (what people say)
  • Context of Use: Task-based

Quantitative desirability studies attempt to quantify and measure some quality of a product, such as aesthetic appeal, brand strength, or tone of voice. These studies can be customized depending on your research questions, but they generally involve first exposing participants to your product (either by showing them a still image or by asking them to use the live product or a prototype). Then you’ll ask them to describe the design by selecting options from a list of descriptive words. With a large sample size representative of your population, trends start to emerge. For example, you may have 84% of respondents describe the design as “fresh.”

Eyetracking Testing

  • Uses: Determining which UI elements are distracting, findable, or discoverable
  • Cost: High
  • Difficulty of Collection: High
  • Difficulty of Analysis: High
  • Type of Method: Behavioral (what people do)
  • Context of Use: Task-based

Eyetracking studies require special equipment that tracks users’ eyes as they move across an interface. When many participants (30 or more) perform the same task on the same interface, meaningful trends start to emerge and you can tell, with some reliability, which elements of the page will attract people’s attention. Eyetracking can help you identify which interface and content elements need to be emphasized or deemphasized, to enable users to reach their goals.

Eyetracking software can create a variety of visualizations using the aggregated gaze data (where users looked on the interface, represented here by the green dots).

A major obstacle to running eyetracking studies is the highly specialized, prohibitively expensive, and somewhat unstable equipment that requires lots of training to use.

If you’re considering running an eyetracking study, check out our free report on How to Conduct Eyetracking Studies.

Choosing a Method

| Method | Typically Used for | Cost | Difficulty of Collection | Difficulty of Analysis | Type of Method | Context of Use |
|---|---|---|---|---|---|---|
| Quantitative Usability Testing | Tracking usability over time; comparing competitors | Medium | Medium | Medium | Behavioral | Task-Based |
| Web Analytics (or App Analytics) | Detecting or prioritizing problems; monitoring performance | Low | Low | High | Behavioral | Live |
| A/B Testing | Comparing two specific design options | Low | Low | Low | Behavioral | Live |
| Card Sorting | Determining IA labels and structures | Low | Low | Medium | Attitudinal | Not Using Product |
| Tree Testing | Evaluating IA hierarchies | Low | Low | Medium | Behavioral | Not Using Product |
| Surveys and Questionnaires | Gather information about your users, their attitudes, and behaviors | Low | Low | Low | Attitudinal | Any |
| Clustering Qualitative Comments | Identifying important themes in qualitative data | Low | Medium | Medium | Attitudinal | Any |
| Desirability Studies | Identifying attributes associated with your product or brand | Low | Low | Low | Attitudinal | Task-Based |
| Eyetracking Testing | Determining which UI elements are distracting, findable, or discoverable | High | High | High | Behavioral | Task-Based |

This table provides a summary of the methods discussed above.

Start with Your Research Question

When trying to determine which quant method to use, lead with your research question. What do you need to know? Some of these methodologies are best suited to very general research questions. For example:

  • How did our product usability change over time?
  • How are we doing compared to our competitors?
  • Which of our problems have the biggest impact? How should we prioritize?

For these types of questions, you’ll probably want to use quant usability testing, web analytics, or surveys.

Other methodologies work well when you have a more specific question you want to answer. For example:

  • How should we fix our global-navigation categories?
  • What do most of our users think about our visual design?
  • Which of these two design alternatives should we use for the dashboard?

For these research questions, you’ll probably want to use A/B testing, card sorting, tree testing, coding qualitative comments, desirability studies, or eyetracking.

There are some grey areas within those recommendations, however. For example, an A/B test may not be an option for your company, for security or technical reasons. If that’s the case, and you can afford it, you could do an in-person quant usability study to compare two prototypes. However, that isn’t the typical use for quant usability testing, so I did not discuss it here.

Consider the Cost

After the research question, the second most influential factor in choosing a method is cost. These methods vary a lot in cost, depending on how you implement the study. The tools you use, the number of participants you have, and the amount of time your researchers spend will all affect the final cost. To make this even more complicated, research budgets vary widely from team to team. Again, the cost estimates here are relative.

Lower-budget teams will rely on digital methods: remote usability testing, online card-sorting platforms like OptimalSort, A/B testing, and web or app analytics. As a rule of thumb, the in-person methodologies (such as in-person usability testing and in-person card sorts) tend to be more expensive because they require so much more of the researcher’s time. Additionally, they can require travel and equipment rentals. Eyetracking is the most expensive methodology listed here, and should be employed only by teams with big budgets and research questions that warrant using it.

This chart shows where the quant methods discussed in this article sit in terms of their suitability for different levels of granularity of research questions (general to specific).

Next Steps

Once you’ve selected a method, learn about it! Do your homework to make sure you’ll be able to plan and conduct the study the way you’d like to, and to ensure you’ll get useful results. I’ve included links throughout this article to point you towards more resources for each method, as well as a Resources section at the end.

A word of caution: You can’t just collect metrics and start making decisions without doing any statistical analysis. It isn’t enough to just collect rating-scale responses from 5 users, take an average, and move on.

For each method discussed here, there are different recommended minimum sample sizes: the number of data points you’ll likely need to collect in order to have reliable data and determine statistical significance. You’ll need to hit those minimum sample sizes. If you don’t, you have no assurance that your findings aren’t just a fluke.
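To get a feel for why sample size matters, here is a rough Python sketch based on the textbook normal-approximation margin of error for a proportion. It is not the source of the per-method recommendations, the approximation is crude for very small samples, and the function names are hypothetical. Treat the output as a ballpark illustration only.

```python
import math

Z_95 = 1.96  # z-value for a 95% confidence level

def margin_of_error(sample_size: int, proportion: float = 0.5) -> float:
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    return Z_95 * math.sqrt(proportion * (1 - proportion) / sample_size)

def sample_size_for_margin(margin: float, proportion: float = 0.5) -> int:
    """Smallest sample size whose approximate margin of error is at most `margin`."""
    return math.ceil(Z_95 ** 2 * proportion * (1 - proportion) / margin ** 2)

for n in (5, 35, 100):
    print(f"n = {n}: about +/- {margin_of_error(n):.0%}")
print("Participants needed for +/- 10%:", sample_size_for_margin(0.10))
```

The default proportion of 0.5 is the most conservative assumption, because it produces the widest margin for any given sample size.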

Be sure to factor in the time you’ll need to research relevant statistical concepts for whichever method you select, as well as the cost of obtaining the correct minimum sample size. I promise, it isn’t quite as hard as it looks, and your quant data will be well worth the trouble.

Resources

Measuring UX and ROI (Full-day course)

“Understanding Statistical Significance” (Article)

“What Does Statistically Significant Mean?” (Article)

“Quantitative Studies: How Many Users to Test?” (Article)

“How to Compute a Confidence Interval in 5 Easy Steps” (Article)

“Return on Investment for Usability” (Article)

Return on Investment (ROI) for Usability, 4th Edition (Report)

“When to Use Which User-Experience Research Methods” (Article)

“UX Research Cheat Sheet” (Article)

Measuring U’s sample size and confidence interval calculators (Tool)

Quantitative Usability Testing

“Quantitative vs. Qualitative Usability Testing” (Article)

“Accuracy vs. Insights in Quantitative Usability” (Article)

“Writing Tasks for Quantitative and Qualitative Usability Studies” (Article)

“Remote Usability Tests: Moderated and Unmoderated” (Article)

“Remote Moderated Usability Tests: How and Why to Do Them” (Article)

“Success Rate: The Simplest Usability Metric” (Article)

Analytics

Analytics and User Experience (Full-day course)

“Analytics vs. Quantitative Usability Testing” (Video)

“Three Uses for Analytics in User-Experience Practice” (Article)

“Five Essential Analytics Reports for UX Strategists” (Article)

A/B Testing or Multivariate Testing

Analytics and User Experience (Full-day course)

“Putting A/B Testing in Its Place” (Article)

“Define Stronger A/B Test Variations Through UX Research” (Article)

“10 Things to Know About A/B Testing” (Article)

“Multivariate vs. A/B Testing: Incremental vs. Radical Changes” (Article)

Card Sorting

Information Architecture (Full-day course)

“Card Sorting: Uncover Users’ Mental Models for Better Information Architecture” (Article)

“Card Sorting: Pushing Users Beyond Terminology Matches” (Article)

“Card Sorting: How to Best Organize Product Offerings” (Video)

“How to Avoid Bias in Card Sorting” (Video)

Tree Testing

Information Architecture (Full-day course)

“Tree Testing: Fast, Iterative Evaluation of Menu Labels and Categories” (Article)

“Tree Testing Part 2: Interpreting the Results” (Article)

“Using Tree-Testing to Test Information Architecture” (Article)

Surveys and Questionnaires

User Research Methods: From Strategy to Requirements to Design (Full-day course)

“Beyond the NPS: Measuring Perceived Usability with the SUS, NASA-TLX, and the Single Ease Question After Tasks and Usability Tests” (Article)

“12 Tips for Writing Better Survey Questions” (Article)

“Cleaning Data from Surveys and Online Research” (Article)

Clustering Qualitative Data

“5 Examples of Quantifying Qualitative Data” (Article)

“How to Code and Analyze Verbatim Comments” (Article)

“Diary Studies: Understanding Long-Term User Behavior and Experiences” (Article)

Desirability Studies

“Desirability Studies: Measuring Aesthetic Response to Visual Design” (Article)

“Using the Microsoft Desirability Toolkit to Test Visual Appeal” (Article)

“Microsoft Desirability Toolkit Product Reaction Words” (Article)

Eyetracking Testing

How to Conduct Eyetracking Studies (Free report)

“Eyetracking Shows How Task Scenarios Influence Where People Look” (Video)