The action of using your fingertips to zoom in and out of the image is an example of a direct-manipulation interaction. Another classic example is dragging a file from a folder to another one in order to move it.

Moving a file on MacOS using direct manipulation involves dragging that file from the source folder and moving it into the destination folder.

定义:直接操纵(DM)is an interaction style in which users act on displayed objects of interest using physical, incremental, reversible actions whose effects are immediately visible on the screen.

Ben Shneiderman first coined the term “direct manipulation” in the early 1980s, at a time when the dominant interaction style was the command line. In命令行界面,TH.e user must remember the system label for a desired action, and type it in together with the names for the objects of the action.


Direct manipulation is one of the central concepts of graphical user interfaces (GUIs) and is sometimes equated with “what you see is what you get” (WYSIWYG). These interfaces combine menu-based interaction with physical actions such as dragging and dropping in order to help the user use the interface with minimal learning.


In his analysis of direct manipulation, Shneiderman identified several attributes of this interaction style that make it superior to command-line interfaces:

  • Continuous representation of the object of interest。用户可以看到他们可以与之交互的对象的可视表示。一旦他们执行行动,他们就可以看到它对系统状态的影响。例如,当使用拖放移动文件时,用户可以看到源文件夹中显示的初始文件,选择它,并且一旦操作完成,他们就可以看到它从源中消失并出现在目的地 - 立即确认他们的行动具有预期结果。因此,通过定义,直接操纵UIs满足第一个可用性启发式: the visibility of the system status. In contrast, in a command-line interface, users usually must explicitly check that their actions had indeed the intended result (for example, by listing the content of the destination directory).
  • 物理操作而不是复杂语法。通过点击,按钮,菜单选择和触摸手势,物理地调用操作。在Move文件示例中,拖放在现实世界中具有直接模拟,因此移动操作的实现具有正确的意小器,可以很容易地学习和记住。相比之下,命令行界面不仅需要用户不仅重新登录命令(“mv”),还需要涉及的对象的名称(文件和目标文件夹的文件和路径)。因此,与DM接口不同,命令行接口基于召回而不是承认并违反了一个重要的可用性启发式。
  • Continuous feedback and reversible, incremental actions。由于系统状态的可见性,易于验证每个操作导致正确的结果。因此,当用户犯错误时,他们可以立即看到错误的原因,他们应该能够轻松地撤消它。相比之下,利用命令行接口,一个单个用户命令可以具有可能导致错误的多个组件。例如,在下面的示例中,目标文件夹的名称包含拼写错误“测量usablty”而不是“测量可用性”。系统只是假设文件名应更改为“测量USablty”。如果用户检查目标文件夹,他们会发现存在问题,但将无法了解是什么导致它的:他们是否使用了错误的命令,错误的源文件名或错误的目的地?

This type of problem is familiar to everyone who has written a computer program. Finding a bug when there are variety of potential causes often takes more time than actually producing the code.

  • Rapid learning.Because the objects of interest and the potential actions in the system are visually represented, users can use recognition instead of recall to see what they could do and select an operation most likely to fulfill their goal. They don’t have to learn and remember complex syntax. Thus, although direct-manipulation interfaces may require some initial adjustment, the learning required is likely to be less substantial.

Direct Manipulation vs. Skeuomorphism

When direct manipulation first appeared, it was based on the office-desk metaphor — the computer screen was an office desk, and different documents (or files) were placed in folders, moved around, or thrown to trash. This underlying metaphor indicates the skeuomorphic origin of the concept. The DM systems described originally by Shneiderman are alsoskeuomorphic— that is, they are based on resemblance with a physical object in the real world. Thus, he talks about software interfaces that copy Rolodexes and physical checkbooks to support tasks done (at the time) with these tools.

As we all know, skeuomorphism saw a huge revival in the early iPhone days, and has now come out of fashion.


虽然SkeOomorphic接口确实基于直接操纵,但并非所有直接操纵界面都需要是伴侣。事实上,今天的flat interfacesare a reaction to skeuomorphism and depart from the real-world metaphors, yet they do rely on direct manipulation.


Almost each DM characteristic has a directly corresponding disadvantage:

  • Continuous representation of the objects?It means that you can only act on the small number of objects that can be seen at any given time. And objects that are out of sight, but not out of mind, can only be dealt with after the user has laboriously navigated to the place that holds those objects so that they can be made visible.
  • Physical actions?One word: RSI (repetitive strain injury). It’s a lot of work to move all those icons and sliders around the screen. Actually, two more words: accidental activation, which is particularly common on touchscreens, but can also happen on mouse-driven systems.
  • 持续反馈?Only if you attempt an operation that the system feels like letting you do. If you want to do something that’s not available, you can push and drag buttons and icons as much as you want with no effect whatsoever. No feedback, only frustration. (A good UI will show in-context help to explain why the desired action isn’t available and how to enable it. Sadly, UIs this good are not very common.)
  • Rapid learning?是的,如果设计很好,但在实践中,可读性取决于界面的设计程度如何。我们都看到了具有较差的标签,未看出的按钮,或者没有比屏幕长度看起来不可点记的按钮或下拉框的按钮。


  • DM很慢。If the user needs to perform a large number of actions, on many objects, using direct manipulation takes a lot longer than a command-line UI. Have you encountered any software engineers who use DM to write their code? Sure, they might use DM elements in their software-development interfaces, but the majority of the code will be typed in.
  • 重复任务不受欢迎。DM接口是很棒的新手因为他们是容易学习,而是因为他们re slow, experts who have to perform the same set of tasks with high frequency, usually rely on keyboard shortcuts, macros, and other command-language interactions to speed up the process. For example, when you need to send an email attachment to one recipient, it is easy to drag the desired file and drop it into the attachment section. However, if you needed to do this for 50 different recipients with customized subject lines, a macro or script will be faster and less tedious.
  • Some gestures can be more error-prone than typing。虽然理论上,由于持续反馈,DM最大限度地减少了某些错误的可能性在实践中,当手势更难执行时,存在比键入等效信息更难的情况。例如,祝你努力移动50TH.column of a spreadsheet into the 2n使用拖放的位置。对于这种确切的原因,Netflix为重新排序的订户的DVD队列提供了3种交互技术:将电影拖动到所需位置(便于短暂的移动),一个按钮快捷方式移动到#1位置(当您时友好必须观看特定的电影),以及键入所需新位置的数量的间接选择(在大多数其他情况下有用)。
Netflix allows 3 interactions for rearranging a queue: dragging a movie to the desired position (not shown), moving it directly to top (搬到顶部option), or typing in the position where it needs to be moved (搬去option).
  • 无障碍may suffer.DM UIS可能会失败视力受损的用户或具有电机技能损伤的用户,特别是如果它们是基于物理动作的大量,而不是按钮按下和菜单选择。(替代方法,但它可能很难实现它们。)


如果没有直接操纵,很难想象现代接口。几乎任何针对广泛观众的接口,都是基于DM的图形组件。随着触摸屏设备的爆炸,我们已经看到DM UIS从原始办公隐喻出发并在各种域中创新。和增强 - 现实和虚拟现实系统将推动DM甚至更新。

Despite the many downsides, we still recommend a heavy dose of direct manipulation for most UIs. Direct manipulation often enhances users’ sense of empowerment over the computer by letting them feel that they are in control and are the ones making things happen. The upsides of DM usually enhance usability more than the downsides degrade it. Any interaction style has its minuses and can be ruined by lack of attention to the details: there is no magic bullet for UX, but there are definitely design ideas that can advance usability if employed correctly, and direct manipulation has proven to be one of these good ideas for more than 30 years.


Shneiderman,B. 1983。直接操纵:超出编程语言的步骤。Computer16.(8), pp. 57–69. (Access-contolled archival copy可在ACM数字图书馆提供。)