This article is a follow-up to Tree Testing Part 1: Fast, Iterative Evaluation of Menu Labels and Categories.

Tree testing evaluates the categories and labels in an information architecture. We recently explained the process for designing a tree test; once you have planned your study, the next step is to collect data and interpret the results. Unlike think-aloud usability testing, most tree tests are run as unmoderated studies and generate only quantitative results. This method allows you to quickly collect data from a large number of users, but requires a different approach to extracting insights. You can’t just sit through a day of testing and jot down notes; instead, you need to analyze the data systematically to identify trends and evaluate their significance.

Collecting Data

Study participants. As with usability testing, a good tree-testing study must recruit representative users as study participants, particularly for products with specialized target audiences. Don’t recruit college students to test a website about life insurance.

Since tree testing allows you to easily collect data from a large group of users, aim for at least 50 users, to allow trends in user behavior to emerge and to minimize the impact of any unmotivated participants who provide poor-quality data. If you plan to test two trees and compare their performance, you’ll need twice as many participants, because the comparison requires a between-subjects study design (i.e., different people test each version).

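Tasks per participant. Make sure each participant performs only 10 tasks (or fewer). Although tree-testing tasks can be completed quickly, having people perform 30 tasks in a row is still not a good idea. Once someone has clicked through the same menu 15 times, they no longer behave like a typical user who has just landed on the site and has never seen the menu before. If you need to test more than 10 tasks, recruit more users and use the randomization feature of your tree-testing tool to assign 10 tasks to each participant.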
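The assignment described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular tree-testing tool implements its randomization; the function name `assign_tasks`, the task and participant IDs, and the seed are all hypothetical.

```python
import random

def assign_tasks(task_ids, participant_ids, tasks_per_participant=10, seed=None):
    """Give each participant a random subset of tasks, in random order."""
    rng = random.Random(seed)
    k = min(tasks_per_participant, len(task_ids))
    # sample() draws without replacement, so nobody sees the same task twice,
    # and the returned order is already shuffled.
    return {p: rng.sample(task_ids, k) for p in participant_ids}

# Hypothetical study: 30 tasks in the pool, 150 recruited participants.
tasks = [f"T{i:02d}" for i in range(1, 31)]
participants = [f"P{i:03d}" for i in range(1, 151)]
plan = assign_tasks(tasks, participants, seed=42)
```

With 30 tasks, 10 tasks per person, and 150 participants, each task is attempted by about 50 people on average. Plain random sampling does not guarantee an equal count per task, so a real tool may balance the assignment more carefully.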

Pilot test. Finally, invite a small number of users to complete the study and review their responses before sending it to your entire group. The pilot test can expose any unintended problems with your task wording early enough to correct them.

Tree-Testing Metrics

Once the results are in, a variety of metrics capture how users understood (or misunderstood) your categories. Treejack and UserZoom, two of the most common tree-testing tools, each present these metrics in a slightly different style, but both provide these quantitative measures for each task in your study:

Success rate: The percentage of users who found the right category for that task

Directness: The percentage of users who went to the right category immediately, without backtracking or trying any other categories


Path measures:

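  • Selection frequency: how often each category was selected
  • First click: the category most people selected first
  • Destination: the category most people designated as their final answer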

Depending on the type of tree and tasks in a study, some of these metrics may be more useful than others at predicting how well the information architecture will perform in real life.
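To make the definitions above concrete, here is a minimal Python sketch that derives them from raw click data for one task. It assumes a hypothetical export in which each response records the ordered list of nodes clicked and the node chosen as the final answer; real tools use their own export formats, and the labels below ("A", "A1", and so on) are invented. Directness is computed here as the share of all users who followed a correct route with no detours, which is one reading of the definition; tools may count it differently.

```python
from collections import Counter

def summarize_task(responses, correct_paths):
    """Compute per-task tree-test metrics from click paths.

    responses: list of {"clicks": [...], "answer": ...} dicts (hypothetical format)
    correct_paths: click sequences that lead straight to a correct answer
    """
    n = len(responses)
    correct_answers = {path[-1] for path in correct_paths}
    direct_routes = {tuple(path) for path in correct_paths}
    successes = [r for r in responses if r["answer"] in correct_answers]
    # Direct = succeeded AND clicked exactly a correct route, with no detours.
    direct = [r for r in successes if tuple(r["clicks"]) in direct_routes]
    return {
        "success_rate": len(successes) / n,
        "directness": len(direct) / n,
        "first_clicks": Counter(r["clicks"][0] for r in responses),
        "destinations": Counter(r["answer"] for r in responses),
    }

# Invented responses for a single task whose correct answer is A > A1.
responses = [
    {"clicks": ["A", "A1"], "answer": "A1"},       # direct success
    {"clicks": ["B", "A", "A1"], "answer": "A1"},  # success after backtracking
    {"clicks": ["B", "B2"], "answer": "B2"},       # failure
    {"clicks": ["A", "A1"], "answer": "A1"},       # direct success
    {"clicks": ["A", "A2"], "answer": "A2"},       # right area, wrong destination
]
summary = summarize_task(responses, correct_paths=[["A", "A1"]])
```

In this invented data set, three of five users succeed but only two of them do so directly, and the first-click and destination counts show where the others went.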



Tree-testing tools calculate a success rate for each task for which you define a ‘correct’ answer. This screenshot from UserZoom indicates that 67% of users found the correct location for the task "Where can you find the directions and hours of operation for the New Mexico State Library?"


Remember that, by its very nature, tree testing eliminates many helpful design elements, such as the search function, secondary navigation options (like related links), and any context cues from the visual design or content. Users see only the stripped-down navigation menu itself.

Example of the user interface shown to participants in a tree test
A tree test shows participants only the task instructions and a stripped-down menu of category labels, as you can see in this task from a UserZoom tree test. Users do not have access to a search function, content, layout, dropdown menus, or any other context that might help explain the menu options.


Instead of expecting to achieve a 100% success rate, use a more realistic frame of reference to evaluate what success rate is acceptable for each task, taking into account:

  • The importance of that task to the overall user experience
  • How each success rate compares with those of other similar tasks (for example, tasks that target content at the same level in the hierarchy)

For example, consider two tasks and their respective success rates in the table below. The success rate for the food-stamps task is much lower than for the other task, but this result is partially because users must drill down three more levels to find the right answer.

Task | Correct Answer(s) | Success Rate
[...] State Library? | [...] | [...]
Find the rules that determine [...] for food stamps in New Mexico. | > Health and Wellness; > Looking for Assistance | [...]

Rather than comparing these two success rates, it would be more realistic to compare either:

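  • The success rate of the food-stamps task with that of another task that also targets content 6 levels down; or
  • The success rate of the food-stamps task performed on two different trees with different labels: one that uses the term "food assistance" and one that uses the term "food stamps."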

Directness and Time Spent

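In addition to measuring how many users got to the right place, it’s important to also consider how much they struggled on the way. Two common tree-testing metrics signal this: time spent, which indicates how long it took users to find the right answer, and directness, which captures how many users reached the right answer immediately, without backtracking or changing categories. Direct navigation is also sometimes called the "happy path," because it indicates a smooth interaction with no confusion or detours.

A task with a high success rate can still be a poor user experience if users must try several places before finally finding the right answer. For example, consider this task about finding the cost of tuition. Although most users eventually found the right answer, only 50% of them took a direct path. Half of the successful users had to retrace their steps at least once before finding it, even though the information was actually available in 3 different places in the tree.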

Example tree test result showing both success rate and directness
This task result from a Treejack study indicates that even though 74% of all users were successful at finding the tuition amount, half of those people took an indirect path and had to retrace their steps at least once.



Success rate and directness tell you whether a category is findable; detailed pathway analysis helps you figure out how to improve categories that don’t work well.


The first click is critical because it often predicts whether a user will eventually be successful in finding the right item. Imagine you are looking for the food court in a shopping mall. If the food court is on the top level and you start by taking the escalator down, your chances of finding it any time soon are slim. But if you start by going to the right level, chances are you’ll be able to wander around a bit and find it, if only by the smell of food.

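First clicks work the same way. Once users get near the right category, context cues and local navigation make it more likely that they will find it. But an incorrect first click is often disastrous; the table below shows the first-click data for a task with a success rate of only 20%. The correct top-level category, Directory, received only 14% of the first clicks. Instead, users started in the Program or School sections, and most ended up wandering around those areas and never made it back to the Directory.

Only 14% of users clicked Directory as their first choice when looking for a list of faculty who teach environmental law; this led to an overall success rate of only 20% for this task in a Treejack study.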
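One way to see how strongly the first click drives the outcome is to break the success rate down by first-click category. The Python sketch below does this for a hypothetical list of (first click, succeeded) pairs; the ten records are invented for illustration and are not the data from the study described above.

```python
from collections import defaultdict

def success_by_first_click(records):
    """records: iterable of (first_click, succeeded) pairs, one per participant."""
    totals = defaultdict(lambda: [0, 0])  # first click -> [users, successes]
    for first_click, succeeded in records:
        totals[first_click][0] += 1
        totals[first_click][1] += int(succeeded)
    n = sum(users for users, _ in totals.values())
    return {
        first_click: {
            "share_of_first_clicks": users / n,
            "success_rate": wins / users,
        }
        for first_click, (users, wins) in totals.items()
    }

# Invented data: 10 participants, only 2 of whom start in the right section.
records = (
    [("Directory", True)] * 2
    + [("Program", False)] * 5
    + [("School", False)] * 3
)
breakdown = success_by_first_click(records)
```

In this made-up sample, everyone who starts in Directory succeeds and nobody else does, so the overall success rate simply equals the share of correct first clicks. Real data are rarely that clean, but a large gap between the rows points to the first click as the problem.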

Examine the first click data carefully when:

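  • A task has a low success rate and/or low directness. The first clicks indicate where users initially expected to find that information, and suggest locations where the item should be moved (or at least crosslisted).
  • The final design will use mega menus, which expose 2nd- and 3rd-level categories at a glance. The ability to see and compare multiple sublevels at the same time can raise success rates well above what you observe in the tree test, but only if the first click is successful and gets the user to the right mega menu.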

If you have many tasks where first clicks are distributed across multiple categories, you may have too many overlapping categories. Do a card sort, or review the tree-test results again and look for other possible organization schemes.


First clicks and final destination for the task of looking up information about an arts festival; all users correctly clicked into the Recreation category, but 35% selected Arts and Culture as their final destination, while 22% chose Explore the Land of Enchantment. Only 30% correctly chose What's Happening in New Mexico. This result indicates that these sibling subcategories overlap too much, and that any of them feels like an appropriate destination to users.




  • When first clicks are evenly distributed in multiple areas, list topics in multiple categories. If this issue occurs for many tasks, consider changing the overall organization scheme.
  • When success rate is low but first clicks are correct, change the labels of subcategories to be more distinct.
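The two rules above, together with the earlier advice to move or crosslist items, can be written as a rough triage function. This is a sketch under stated assumptions, not a validated procedure: the cutoffs `low_success` and `dominant_share` are arbitrary illustrative values, and the function name `triage` is hypothetical.

```python
def triage(first_click_shares, correct_top_level, success_rate,
           low_success=0.5, dominant_share=0.6):
    """Suggest a follow-up for one task. Both cutoffs are arbitrary assumptions."""
    if success_rate >= low_success:
        return "no change suggested"
    correct_share = first_click_shares.get(correct_top_level, 0.0)
    if correct_share >= dominant_share:
        # Users start in the right place but pick the wrong destination.
        return "relabel subcategories"
    if max(first_click_shares.values()) < dominant_share:
        # First clicks are spread across several areas.
        return "list in multiple categories"
    # First clicks cluster in one wrong category.
    return "move or crosslist into the category users chose"

# Arts-festival task from the text: every first click correct, 30% success.
festival = triage({"Recreation": 1.0}, "Recreation", 0.30)

# Faculty-list task: 14% and 20% come from the text; the Program/School split is invented.
faculty = triage({"Directory": 0.14, "Program": 0.45, "School": 0.41}, "Directory", 0.20)
```

Here `festival` evaluates to "relabel subcategories" and `faculty` to "list in multiple categories". Treat the output as a prompt for closer inspection of the paths, not as a decision.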

Learn more about choosing organization schemes and labels in our full-day course, Information Architecture.