User Interfaces
Dwight H. Lillie
http://www.netsci.org/Science/Cheminform/feature05.html
Introduction
Good user interfaces are a bit like good art - hard to describe, but when you see one, you know it. This is largely because the interface itself is only a small part of a much larger set of considerations. The development of the interactive components of computer-based applications can be modeled as an interaction between two discrete processes. The first process is a set of activities that may be referred to collectively as "workflow analysis." The second process, which folds in the results from the workflow analysis, is the final design of the user interface. Each of the processes is considered explicitly or implicitly during the development of user interfaces.
Workflow analysis is an essential first step in design because it often reveals hidden but required functionality. Applied to computer systems (and their target applications), it frequently yields detailed information about who the users are, a logical breakdown of their activities, and the requirements for hardware, software, and data. The technique is therefore useful for pointing out the "hot" spots in users' environments, and the delivery of effective computational tools is easier when an understanding of the user's needs is the first step of the analysis. Conversely, placing arbitrary restrictions on the hardware and software choices that can be considered in possible solution architectures may short-circuit the workflow analysis, leading to incorrect conclusions.
An apocryphal example, repeated numerous times, concerns a company that wanted to increase the productivity of its systems analysts and developers. Rather than examining the workflow of these teams, the company invested in a new design methodology: retraining the staff, developing in-house seminars for follow-on training, purchasing software to support the methodology, and so on. The end result of this expenditure was little increase in overall productivity. An analysis of the users' principal complaints would instead have pointed to the need for more computing capacity to shorten the edit-compile-link-debug cycle.
One example of the impact of predefining solutions to interfaces is the emergence of Microsoft's Windows user interface as the de facto standard for desktop business and personal computer systems. In effect, this has channeled the development of graphical user interfaces (GUIs) to those which conform to conventions established by Windows. Deviations from these conventions, even when they improve overall performance, are likely to be ignored unless they are adopted by Microsoft.
Some History (...but not too much)
The first interface designed for interacting with computers was a typewriter-style keyboard for input and a printer and/or a video display terminal (VDT) for output. There are still some environments in which these are the computing tools of choice. The VDTs, sometimes driven by controlling electronics as complex as the computers themselves, were generally capable of displaying only fixed-font text on lines of 80 characters each, 24 lines per "page." Very expensive VDTs could display non-textual information via vector graphics.
This usage model underwent a large shift when Xerox integrated hardware and software components into a single workstation that let users switch between multiple "terminal" sessions simply by moving a visual cursor (pointer) with a hand-operated pointing device, i.e., a mouse. This capability resulted in both a perceived and a measurable increase in productivity - the latter being a somewhat unusual event.
Evolution
The intervening time period has seen this model of workstation computing increasingly refined. Alternatives to the original mouse have been explored, with most of them being discarded or relegated to specific application niches. Variations on the keyboard, both key mappings as well as physical designs, have also been tried. Interestingly, even though operating system software has evolved such that keyboards can be remapped at will, very few users have taken advantage of these features. While alternative physical designs have become increasingly available, higher costs (relative to conventional designs) have led to limited acceptance of these models.
Software capabilities have become more refined as well. Commercial application software is generally no longer designed for simple, text only terminals. Rather, it attempts to exploit underlying native hardware and software subsystems. Simultaneously, operating systems and their various support subsystems have also been refined and have become increasingly graphical in nature.
As the rapid changes in interfaces detailed above have unfolded, a rather curious pattern in systems evolution seems to have emerged. The requirements of leading free- or share-ware and commercial applications have outstripped the capabilities of the underlying operating system software. This has forced these leading edge applications to invent their own underlying technologies, which when successful, found their way into the next generation of operating systems.
GUIs...
There are several factors that have made Windows the dominant workstation graphical user interface. Foremost is the fact that the Windows environment provides color, shading, texture, variable pitch, and multiple-font support for all of the applications that run under its umbrella. It also allows typical operating-system-level objects such as executable programs, databases, word processor files, and directory structures to be represented as graphical images or icons that can be manipulated directly by moving a pointer over the image and "clicking."
A reasonable question, however, is what makes it a GUI? Users who are new to the environment tend to use the mouse exclusively. With experience, they tire of "clicking" their way down a series of cascading menus to perform a mundane action such as changing the typeface of selected text. Instead, they learn the keyboard shortcut, if one is available, for simple tasks and bypass some of the prime functionality of the GUI.
GUI Components
Most current GUIs consist of surprisingly few components or "controls." While almost all of these components convey information via text, the text is surrounded by objects defined by color, texture, and 3D appearance and having simple, primitive operations. For example, "buttons" are used to select from a set of choices, and "list boxes" are used to convey single-column information in multiple rows. When the information set is larger than the window, a scrollbar is attached to the list box. There are also text edit/entry controls that permit text to be typed (or copied and pasted); if these boxes are large enough to hold multiple lines of text, it is automatically wrapped and left-justified. Unfortunately, the use of these controls, especially on a screen that can hold several at once, is only made acceptable by overloading the Tab key so that the application switches the keyboard focus between the various control elements. This mode of operation appears to be the dominant method of selecting a control window for intermediate and experienced users; inexperienced users do not appear to take full "advantage" of the functionality.
Difficulties with GUIs - Visual Overload
Continuing the train of thought that led to shortcut keystrokes: producers of an application may learn via surveys or feedback that users still want to perform an action visually, i.e., by selecting it with a pointer. The introduction of the "toolbar," which provides iconic representations of shortcut keyboard actions, solves this difficulty. Unfortunately, in the case of Windows (following the trend in which next-generation operating systems adopt functionality developed by earlier applications), a semi-pathological condition has arisen from the fact that individual applications are able to display their own toolbars when active. The problem with this paradigm is seen when one application not only includes data files from a second application (e.g., Word including an Excel spreadsheet) but also takes control of the manipulation of this data. The result is a visual display in which graphic icons, images, toolbars, pull-down menus, and other such controls dominate over 50% of the screen, leaving very little space for the display of the information being manipulated.
Another problem with current GUI definitions is that a very large number of graphic images are used to represent the various actions in these toolbars. Except for a very small subset of these images, their meaning and use is often not intuitive. As a result, users are forced to resort to printed or on-line help; in either case, the help itself must be selectable via an image, since an image cannot be textually searched. Word processors serve as a good example of these problems. A pair of scissors in a word processor can mean "cut," but an icon showing several overlapping sheets of paper has no intrinsic contextual reference. To address this, graphic images now support pop-up descriptions: if the mouse rests on the image for a finite period of time, a message box pops up with an appropriate set of words describing it. Programs have thus moved from being text-based to graphics-based to an unsatisfactory hybrid of both representation styles.
Overlapping Usage Models
Windows and some of its bundled applications (e.g., FileManager) support different usage models for manipulating information. The COPY function in Windows is a good example. To copy a file from one location to another, the user selects the Copy item under the File menu. A "dialog" box then pops up so that the user can type the name of the file to "copy from" and the name of the file to "copy to." This can be referred to as an "operational" or "functional" model. Unfortunately, when this model is grafted onto the graphical interface, it is much less effective than typing a simple copy command.
Another model, referred to as an "object" model, is supported by the FileManager. A user selects the graphical object via the pointer and "drags" it, while holding down the mouse button, to a destination. The model is a bit awkward in that, if the destination isn't visible when the file is selected, the user must open a new FileManager window in which the destination is available. The behavior is also unexpected in that the operation is a "copy" if the movement is across disk systems, but a "move" if it occurs within a directory structure on the same disk. While nothing is inherently wrong with the behavior or with the conditional differences, they are not intuitive and require more information than can easily be provided in a GUI environment.
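The conditional drag behavior is trivial to state in code; a minimal sketch follows (the function and drive labels are illustrative, not any real Windows API):

```python
# A sketch of FileManager's conditional drag-and-drop rule: the same
# gesture copies across disks but moves within one disk. The function
# name and drive labels are hypothetical, chosen only for illustration.
def drag_action(source_drive: str, dest_drive: str) -> str:
    """Return the action performed for a drag gesture."""
    return "copy" if source_drive != dest_drive else "move"

# Dragging from C: to D: copies; dragging within C: moves.
across = drag_action("C:", "D:")   # "copy"
within = drag_action("C:", "C:")   # "move"
```

That the rule fits in one line underlines the point above: the problem is not the logic itself but that the GUI gives the user no cue about which branch will be taken.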
By way of contrast, OS/2 seeks to implement an object model that appears to be more "intuitive." As in the FileManager example above, the user selects the object (file) to copy and then invokes a property action from a menu activated by clicking the right mouse button.
Limited Expressiveness
Experienced users of command line interfaces, particularly in Unix environments, are quite used to a data flow method of operation. Information flows, or is "piped," from one state to another by feeding it through one or more filter operations. For example,
pict <foo.txt | psroff -ms -t -FCourier | rsh bar lpr -Pps
causes the information in foo.txt to be piped through the program pict (a program that processes vector picture commands). The output from pict is piped into psroff, a text formatter that produces PostScript output, and the output from that program is piped to the line printer (ps) on the remote system (bar). psroff itself consists of several programs: a macro-command processor, the text formatter, a text-to-PostScript converter, and other routines.
Expressing this set of operations via current GUI operations has historically been difficult. Comparable operations, such as insertion of an Excel spreadsheet inside a Word document, now exist and have become easier; however, if the operation must be applied to a whole set of documents, it must be performed individually by hand. In contrast, the Unix procedure above can be run automatically by encapsulating it in a single script command. While the Microsoft Word macro language could do the equivalent of the Unix script, use of this facility cannot be considered a GUI application.
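The data-flow model behind the pipeline can be sketched in a few lines of Python. The filters here (uppercasing, line numbering) are toy stand-ins for pict and psroff, chosen only so the composition is runnable:

```python
from functools import reduce

def pipe(data, *filters):
    """Feed data through each filter in turn, like a Unix pipeline."""
    return reduce(lambda d, f: f(d), filters, data)

# Toy stand-ins for pict and psroff: uppercase the text, then number lines.
def upper(text):
    return text.upper()

def number(text):
    return "\n".join(f"{i}: {line}"
                     for i, line in enumerate(text.splitlines(), start=1))

result = pipe("foo\nbar", upper, number)   # "1: FOO\n2: BAR"
```

Because the composed pipeline is itself an expression, applying it to a whole set of documents is a single loop - the scripting advantage the paragraph above describes, and one that the functional GUI model has no way to state.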
Exposure of the End User to Foundation Technology
New application software generally is being developed using methods that focus on object models rather than functional models of the data (newer approaches, such as the Object Modeling Technique (OMT), advocate a synthesis of object, data-flow, and event-tracking methods). This bias is often exposed, explicitly or implicitly, to the user through the GUI without consideration of the user's actual tasks. When users have no preconceived expectation of the way in which tasks should be accomplished in software, problems may be minimal. However, if the user already has a set of expectations for the software's use, perhaps because of experience with other systems or similarity to other tasks, the resulting change in behavior may be not only uncomfortable but often unacceptable.
An example of this problem is the case of a user performing a text search in a word processor. The process followed is straightforward:
- Enter the Search Command
- Enter the text to search for
- Review the first match
- ...
In an object method, the user would select text to be searched, the text to be used as the query, and then apply the search method.
- Select text to be searched
- Select query
- Apply Search
- ...
For users accustomed to following a process or procedure in order to accomplish a task (as opposed to first selecting a method and then applying the method to a set of objects) this represents a change in how they complete tasks.
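The contrast between the two models can be made concrete with a small sketch (neither function reflects a real word-processor API; both are hypothetical):

```python
# Functional model: the user first invokes the command, then supplies
# the data it operates on.
def search(text: str, query: str) -> int:
    """Return the index of the first match, or -1 if absent."""
    return text.find(query)

# Object model: the data is selected first, and the method is then
# applied to the selection.
class Selection:
    def __init__(self, text: str):
        self.text = text

    def search(self, query: str) -> int:
        return self.text.find(query)

# Same result either way; what differs is the order of the user's steps.
pos_functional = search("the quick brown fox", "brown")
pos_object = Selection("the quick brown fox").search("brown")
```

The code makes the point of the section visible: the computation is identical, so the discomfort users feel is entirely about which of the two interaction orders matches their habits.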
Spanning Multiple Disciplines
Biosequences...
At this point, our discussion has only examined requirements for users performing relatively mundane activities. If we also consider multiple disciplines, we quickly discover some very complex concepts that are not easily abstracted or generalized. For example, scientists working with biosequences think of them at several levels of abstraction. If the issue is one of determining hydrophobic versus hydrophilic regions of the sequence, they are generally interested in the proximity and size of the polar, nonpolar, charged, and uncharged R-groups that hang off the amino acid backbone. The presence of known disulfide cross-linkages may also be of interest.
Knowledge of active catalytic and inhibitory sites raises the abstraction to a higher level. In this case, one views the rest of the biosequence as providing a structural platform on which to build the active machinery. Specific replacements of individual amino acids or chemical modification of one of the R-groups can render the entire biosequence inactive. In these cases the individual location of the amino acids and the composition of their side chains is essential.
Depending on the level of abstraction, the representation of the biosequence becomes critical. If the entire biosequence is represented and depicted as individual atoms, bonds, and charges, the user becomes overwhelmed with information. In the example below, the simple 2D amino acid sequence Ser-Gly-Tyr-Ala-Leu is given as a collection of atoms and bonds. Imagine the complexity of adding 3D information to the representation.
![[Sequence]](Images/Lillie/figure1.gif)
For biosequences of any size this method of depiction rapidly becomes unusable. Compounding the issue, however, is that until recently most commercial chemical and biological drawing software provided no method for interconversion between the string representation (Ser-Gly-Tyr-Ala-Leu) and a computable representation (i.e., atoms, bonds, charges, etc.). As a consequence, input of this type of structure was forced down to the atom and bond level, which resulted in a loss of structural information. The alternative to atom/bond-based input was to allow the sequence to be typed as one- or three-letter codes and to post-process the input into the atom/bond representation.
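The post-processing step mentioned above can be sketched as a lookup from three-letter codes to per-residue records. The code-to-formula table here is abridged to the five residues in the example, and the function name is hypothetical; real software would expand each record to full atom/bond detail and account for backbone linkage:

```python
# Abridged table mapping three-letter residue codes to molecular
# formulas of the free amino acids (illustration only).
RESIDUES = {
    "Ser": "C3H7NO3",
    "Gly": "C2H5NO2",
    "Tyr": "C9H11NO3",
    "Ala": "C3H7NO2",
    "Leu": "C6H13NO2",
}

def parse_sequence(seq: str):
    """Expand 'Ser-Gly-...' into (code, formula) records, in order."""
    records = []
    for code in seq.split("-"):
        if code not in RESIDUES:
            raise ValueError(f"unknown residue code: {code}")
        records.append((code, RESIDUES[code]))
    return records

records = parse_sequence("Ser-Gly-Tyr-Ala-Leu")
```

The point of the sketch is that the string form preserves the sequence-level structure; converting downward to atoms and bonds is mechanical, while recovering the string from a bare atom/bond list is not.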
Chemical Structures...
A different twist on representation, also applicable to biosequences, is the simple representation of chemical structures. Chemists think in terms of multiple abstractions depending on their view of the data. For example, in the structures given below, there are three discrete structures on the left and a simple Markush structure on the right in which the three substituents are represented as a single Gk group.
![[Markush representations]](Images/Lillie/figure2.gif)
Mapping between the two levels of abstraction illustrated above is very important. The representational method on the right may be used in a patent document as a simple way to claim all of a particular set of structures. It also may be used to readily define the set of structures upon which a specific set of measurements or calculations were performed.
These capabilities reflect a requirement that users of chemical information systems have stated repeatedly: search through a set of structures, determine the smallest set of unique compounds with substituents that answer a given query, and generate Markush or generic structure depictions for the set. Users of reaction information systems have come to expect this type of capability, as many of the systems they use provide a high-level (A + B -> C -> D + E) overview as the answer to a query. However, layered levels of abstraction are a function of the underlying chemical and reaction models designed into the databases rather than a capability built into the chemical database system.
One of the overriding precepts in chemistry is the statement that "structure dictates function." Thus, visualization of a structure in three dimensions plays an enormous role in helping scientists understand its mechanism of action. Crystallographers, for example, continue to build large three dimensional physical representations of their compounds to aid in data analysis. A further complication arises when one considers that although a simple 2D representation of a molecule is sufficient to provide the atom and bond list required to store a compound in the database, there are multiple 3D conformations that can be represented by the simple 2D graph.
![[3-D representations]](Images/Lillie/figure3.gif)
From an information-management point of view, this is a disaster: what do you put into a 3D database? If 3D conformations are generated as needed, but descriptors are put into the database to aid searching, then logs must be maintained recording program versions, the conditions under which the programs were run, and a series of other administrative details. Even then, nothing guarantees that the conditions used to put the data into the computer can be recreated at retrieval time.
Further complicating the GUI for database systems is the multiplicity of abstraction levels that can be independently applied to a particular view of the data. These abstractions are applied unconsciously by individuals and are difficult to elicit.
Next Steps
Returning to the generic theme of interfaces, the GUIs currently available (e.g., Windows) have serious shortcomings: they have failed to provide a consistent interface that accurately maps to a user's method of completing tasks. Successful users - those who have seen true productivity increases - have mapped their needs to the functionality that the GUI provides. Is there an alternative solution? Maybe.
It is clear that many of the necessary components for a successful GUI design are available and are beginning to make themselves felt in the marketplace. Many others are out there, but their introduction will be delayed. An example is the appearance of Microsoft's Bob.
Bob is successful not because it works well and makes users feel good about using the computer to accomplish their tasks. Rather, the program displays components that "feel" right: office tasks are defined in terms of the contextual navigation of metadata. Workflow analysis was used to examine how an individual finds his or her way around the office, and the way in which information is found is placed into its proper context: "Where did I place that file on Jones yesterday?... oh yes, I left it with the McKearney papers...", "Where's my scratchpad?..."
In accomplishing these tasks, the user shifts piles of papers on the desk, rummages through an inbox, and may open a file cabinet and browse through the folders. The mechanics of these operations are easily captured in the Windows GUI, but the GUI doesn't provide the spatial, visual, auditory, or other cues that facilitate contextual links to office procedures. Thus, we can use real-world offices effectively but struggle to do similar activities within a flat, context-free GUI.
The continued integration of "multi-media" components (highly textured visual images and sound), along with new hardware devices for audio ("surround-sound," a term long in use in the stereo market), visualization ("surround-vision," or virtual reality), and information input, allows us to model the real world ever more closely. As these models evolve, the mismatches arising from users' expectations that things will operate as they do in the real world will be minimized.
NetSci, ISSN 1092-7360, is published by Network Science Corporation. Except where expressly stated, content at this site is copyright (© 1995 - 2010) by Network Science Corporation and is for your personal use only. No redistribution is allowed without written permission from Network Science Corporation. This web site is managed by:
- Network Science Corporation
- 4411 Connecticut Avenue NW, STE 514
- Washington, DC 20008
- Tel: (828) 817-9811
- E-mail: TheEditors@netsci.org