Section 2

2.1 Introduction

This section provides additional background for the remainder of this document. The first subsection looks at related evaluation efforts from the HCI and CSCW communities. The second subsection introduces the scenario-based approach, and the third subsection defines our terminology.

2.2 Methods of Interface Evaluation

Evaluations of human-computer interaction have traditionally been done by a number of methods, including field studies, laboratory experiments, and inspections. Each method assesses different aspects of the interfaces and places different demands on the developer, user, and evaluator.

Collaborative technology is best evaluated through field studies because they can, among other things, assess the social-psychological and anthropological effects of the technology (Grudin 1988). Unfortunately, field studies require substantial funding, a robust system, and an environment that can accommodate experimental technology. These three conditions are difficult to satisfy, so researchers turn to less onerous means, such as inspection methods and laboratory exercises.

System inspections, such as cognitive walk-through (Polson et al. 1992), heuristic evaluation (Nielsen and Molich 1990), and standards inspection (for example, Open Software Foundation Motif inspection checklists), employ a set of usability guidelines written by usability experts. A fairly large set of guidelines exists for user interface design and single-user applications, but few guidelines are available for multi-user applications or collaborative technology. Also, these methods are largely qualitative rather than quantitative, and they require HCI expertise that may not always be available.

This leaves laboratory experiments, or empirical testing, which as an evaluation technique is more closely related to field studies than inspection techniques are. Experiments are very appealing for new and rapidly evolving technology and are potentially less expensive than field studies. However, because they are conducted in controlled environments with time restrictions, they less accurately identify dynamic issues such as embedding into existing environments, learning curves, and acculturation. Watts et al. (1996) recommend compensating for this flaw by performing ethnographic studies followed by laboratory experiments. The ethnographic study helps evaluators understand the “work context,” which influences what is measured and observed in the laboratory.

A generic process cannot presuppose a specific work context. Rather, we have chosen to develop scenarios and measures based on high-level collaborative system capabilities, to provide broad applicability across a range of applications. Ethnographic studies related to specific work contexts could provide a useful tool for validating the measures, because some measures may not be appropriate or of interest in certain contexts.

2.3 Use of Scenarios

A scenario is an instantiation of a generic task type, or a series of generic tasks linked by transitions. It specifies the characteristics of the group that should carry it out, and the social protocols that should be in place. It describes what the users should (try to) do, but not usually how they should do it. (Note that scenarios can be scripted in various degrees of detail and thus could constrain evaluators’ choices for how they accomplish tasks; scripts will be discussed later in this document.)

Scenarios are used in laboratory experiments to direct system usage. They are versatile tools that can be used for many development activities including design, evaluation, demonstration, and testing. When used for evaluation, scenarios exercise a set of system features or capabilities.

We would like to define general, reusable scenarios for collaborative technologies. This is a challenge, requiring consideration of a large set of technologies, but we can build on earlier work in this area.

In 1995, researchers met at the European Human-Computer Interaction (EHCI) conference to develop scenarios that could be used to design, evaluate, illustrate, and compare CSCW systems (Bass 1996). Their approach was to define generic tasks that would be used to construct technology-specific scenarios. The tasks they described were mostly single-user activities such as joining a conference, setting up a new group, and integrating a system with a collaborative tool.

Our approach begins by defining collaborative system capabilities or functional requirements. Our “capabilities” are defined at a higher level than the tasks defined by the EHCI researchers. The tasks we use to evaluate the capabilities are combined to build scenarios that are much more general than those described by the EHCI group. Consequently, the scenarios are appropriate vehicles for cross-technology comparisons. Many of the scenarios can be segmented if the complete scenario is too large to use for comparisons. Also, general scenarios can be readily adapted for any system that supports the capabilities required by the scenario.

Nardi (1995), who has extensively studied the use of scenarios for interface design, has argued for a provision to create a library of reusable scenarios. We will begin to populate such a library with scenarios that are technology-independent. Technology-specific scenarios could be added when scenarios are specialized for real systems.

2.4 Definition of Terms

This evaluation program is concerned with three principal variables: participants, collaborative environments, and collaborative activities. It is easy to say that participants are actors who engage in collaborative activities. Classifying collaborative environments and activities in meaningful ways takes a bit more work.

2.4.1 Collaborative Environments

A collaborative environment is a system that supports users in performing tasks collaboratively. It may be a particular piece of software, such as Lotus Notes or MITRE's Collaborative Virtual Workspace (CVW), or it may be a collection of components used together to enable collaboration.

We are charged with the task of providing an evaluation methodology for collaborative computing systems, present and especially future. Part of our approach involves examining the types of things an environment allows one to do in support of collaboration. To describe them, we must define these terms: requirement, capability, service, and technology.

Requirements for collaborative systems refer to the high-level goals that a group needs to accomplish. An example is “I need to keep my colleagues informed of my progress on this project.”

Collaborative capabilities are relatively high-level functions that support users in performing particular collaborative tasks. Examples include synchronous human communication, persistent shared object manipulation, and archival of collaborative activity.

The term service is used to describe the means by which one or more collaborative environments achieve a given capability, and technology is used to describe the particular hardware and/or software implementation of a service. For example, a service is email, and a technology is Eudora.

To tie together the four components of collaborative environments, consider the following. To satisfy the requirement of sharing information with colleagues, a group could use the collaborative capability of synchronous human communication. One service that may be used to achieve the goal in a variety of collaborative environments is audio conferencing. One technology for audio conferencing is Lawrence Berkeley Laboratory’s Visual Audio Tool.

Examining which requirements and capabilities a collaborative environment supports, and the services and specific technologies it uses to do so, is one way of generating a functional categorization of the collaborative environment. The categorization can be used to help determine which collaborative systems may be best suited for the proposed activities and which scenarios can be used to evaluate those systems.
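
The functional categorization described above can be sketched as a small data model. The following is only an illustration under stated assumptions, not part of the methodology itself: the class names (Service, Environment, ScenarioSpec), the function applicable_scenarios, and the two example scenarios are hypothetical. Only the audio conferencing example is taken from the text. A scenario is treated as usable for an environment when every capability it requires is provided by at least one of the environment's services.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """A service, the capability it achieves, and one implementing technology."""
    name: str        # e.g. "audio conferencing"
    capability: str  # e.g. "synchronous human communication"
    technology: str  # e.g. "Visual Audio Tool"


@dataclass
class Environment:
    """A collaborative environment described by the services it offers."""
    name: str
    services: list[Service] = field(default_factory=list)

    def capabilities(self) -> set[str]:
        # The functional categorization: which capabilities are covered.
        return {s.capability for s in self.services}


@dataclass(frozen=True)
class ScenarioSpec:
    """Hypothetical scenario record: a name plus the capabilities it exercises."""
    name: str
    required_capabilities: frozenset[str]


def applicable_scenarios(env: Environment, scenarios: list[ScenarioSpec]) -> list[str]:
    """Return names of scenarios whose required capabilities the environment supports."""
    caps = env.capabilities()
    return [s.name for s in scenarios if s.required_capabilities <= caps]


# Example environment built from the audio conferencing example in the text.
demo_env = Environment(
    name="demo environment",  # hypothetical name
    services=[
        Service(
            name="audio conferencing",
            capability="synchronous human communication",
            technology="Visual Audio Tool",
        )
    ],
)

# Hypothetical scenarios, used only to show the matching step.
demo_scenarios = [
    ScenarioSpec("status briefing", frozenset({"synchronous human communication"})),
    ScenarioSpec(
        "shared document revision",
        frozenset({"synchronous human communication",
                   "persistent shared object manipulation"}),
    ),
]

usable = applicable_scenarios(demo_env, demo_scenarios)
```

In this sketch the second scenario is excluded because the example environment offers no service for persistent shared object manipulation. A real categorization would probably also record requirements and allow several technologies per service.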

2.4.2 Tasks or Collaborative Activities

Tasks or collaborative activities are what people do, or wish to do, together.

We use the term task synonymously with collaborative activity. Task is a term that transcends the level of detail at which the activity is described; task may refer to anything from a real activity that people are engaging in to a highly scripted mock-up of such an activity intended strictly for evaluation purposes.

A general work task (hereafter referred to as simply a “work task”) is a particular objective of a group, such as making a decision, disseminating information, or planning for the next phase of a project. A work task decomposition based on McGrath’s categorization (McGrath 1984) is discussed in Section 3 to aid in generating specific measures and anticipated needed capabilities and services. A transition task is the activity necessary to move from one objective to another. For example, starting up a session, summarizing results of a decision, reporting the decision to absent colleagues, and assigning action items constitute transition tasks. Social protocols constitute attributes of work tasks, and are those activities that directly facilitate the interpersonal management of joint work, such as floor control and access control mechanisms. Group characteristics such as size, homogeneity, and collocation versus non-collocation affect how tasks can be performed.

2.4.3 Scenarios

Several different types of descriptions of mocked-up activity can be used during evaluations. At the highest level we have the scenario. This is a high-level description of a collaborative task that subjects might be asked to engage in to test collaborative systems. A fully specified scenario will include the background documentation, instructions, etc. required to have a subject or subjects enact the scenario. Scenarios are often broken down into their constituent tasks. In some cases it will be possible to do evaluations based upon only a subset of the tasks for a given scenario.
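
As a rough illustration of how a scenario decomposes into constituent tasks, and how a subset of those tasks can be evaluated on its own, consider the following minimal sketch. The names TaskKind, Task, Scenario, and segment are assumptions made for this example and do not come from the evaluation program. The task names reuse the work task and transition task examples given earlier in this section, while the scenario name and group size are invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class TaskKind(Enum):
    WORK = "work"              # a particular objective of the group
    TRANSITION = "transition"  # activity needed to move between objectives


@dataclass(frozen=True)
class Task:
    name: str
    kind: TaskKind


@dataclass
class Scenario:
    """Hypothetical scenario record: group characteristics, protocols, tasks."""
    name: str
    group_size: int
    social_protocols: list[str]
    tasks: list[Task] = field(default_factory=list)

    def work_tasks(self) -> list[Task]:
        return [t for t in self.tasks if t.kind is TaskKind.WORK]

    def segment(self, start: int, stop: int) -> Scenario:
        """Return a smaller scenario covering only tasks[start:stop]."""
        return Scenario(
            name=f"{self.name} (tasks {start}-{stop - 1})",
            group_size=self.group_size,
            social_protocols=list(self.social_protocols),
            tasks=self.tasks[start:stop],
        )


# Invented example scenario assembled from task examples used in this section.
decision_meeting = Scenario(
    name="decision meeting",
    group_size=4,
    social_protocols=["floor control"],
    tasks=[
        Task("starting up a session", TaskKind.TRANSITION),
        Task("making a decision", TaskKind.WORK),
        Task("summarizing results of a decision", TaskKind.TRANSITION),
        Task("reporting the decision to absent colleagues", TaskKind.TRANSITION),
    ],
)

# Evaluate only the first two tasks of the scenario.
first_part = decision_meeting.segment(0, 2)
```

Segmenting by position is the simplest possible choice. An actual evaluation would need to check that the selected tasks still make sense without the ones that were dropped.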

Some evaluation goals are better met by precisely repeatable scenarios, perhaps even by automated actors. For this purpose there are scripted scenarios, which are procedural descriptions of scenarios. Scripts can be written at levels corresponding to technologies, services, or capabilities.