Ubicomp attempts to bring together a number of research communities, each focusing on different aspects of ubiquitous computing. While it is obvious that researchers from these communities will need to coordinate and communicate to achieve the goal of ubiquitous computing, that does not diminish the difficulty of the task. We propose that one way to build a larger sense of community is through the development of evaluation programs, with the research community participating in deciding what to test, how to test it, and which metrics are appropriate. Such an activity focuses the research community and also helps to demonstrate progress in the area. The overall vision of ubiquitous computing is overwhelming, but by concentrating on specific instantiations of ubiquitous computing and developing scenarios that demonstrate them, the research communities gain a common goal. Focusing effort in specific directions gives the community direction and is instrumental in demonstrating success. As success is achieved in these focused areas, the community can evolve the evaluation in new directions based on lessons learned. The value of evaluations is demonstrated by the Text Retrieval Conference (TREC) activities. The information retrieval community has conducted nine years of evaluations, outlasting the research program for which the evaluation was created. TREC has grown to include new tracks each year and has helped the community to analyze and improve information retrieval algorithms (http://trec.nist.gov).
Interactive systems, and ubiquitous computing in particular, pose more complex evaluation challenges than non-interactive text retrieval. However, there are several possibilities. One is to start by evaluating the various subsystems that make up ubiquitous computing: perceptual user interfaces, dynamic service discovery, wireless networking services, distributed data systems, and input and output through distributed user interfaces. The issues here are to establish metrics and evaluation methodologies for individual components and to determine what, if anything, successful component evaluations imply about the system as a whole.
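To make the component-level option concrete, a sketch of what such an evaluation harness might look like follows; it scores a hypothetical perceptual recognizer against ground-truth labels, and the event labels, latencies, and metric choices are our illustration, not part of any workshop proposal.

```python
# Hypothetical component-level evaluation harness: scores a perceptual
# recognizer (one ubicomp subsystem) against ground-truth event labels.
from dataclasses import dataclass

@dataclass
class Result:
    precision: float        # fraction of reported events that were correct
    recall: float           # fraction of actual events that were detected
    mean_latency_ms: float  # average response time per event

def evaluate_component(predictions, ground_truth, latencies_ms):
    """predictions/ground_truth: per-event labels (None = no event reported)."""
    true_pos = sum(1 for p, g in zip(predictions, ground_truth)
                   if p is not None and p == g)
    predicted = sum(1 for p in predictions if p is not None)
    actual = sum(1 for g in ground_truth if g is not None)
    return Result(
        precision=true_pos / predicted if predicted else 0.0,
        recall=true_pos / actual if actual else 0.0,
        mean_latency_ms=sum(latencies_ms) / len(latencies_ms),
    )

# Example: three sensing events, one missed detection.
print(evaluate_component(["enter", None, "leave"],
                         ["enter", "enter", "leave"],
                         [120.0, 95.0, 140.0]))
```

Even with a harness this simple, the open question raised above remains: a recognizer scoring well in isolation says little about how the integrated system behaves.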
Another possibility would be to evaluate individual systems, end-to-end, as they are built, using traditional usability evaluation methodologies. This would give us information about individual systems and would perhaps allow researchers more flexibility in choosing particular domains. Would this necessitate changes in usability evaluation methodologies? Typical usability metrics are effectiveness, efficiency, and satisfaction. Are these valid metrics for technology that is still under development? Is it feasible to compare systems across domains of use?
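For reference, the three standard metrics can be computed directly from task-session logs. The sketch below uses hypothetical log data and conventional definitions (completion rate, mean time per successful task, mean questionnaire rating); it is an illustration of the metrics, not a methodology endorsed by the workshop.

```python
# Standard usability metrics computed from hypothetical task-session logs:
# each session records (task_completed, time_on_task_seconds, satisfaction_1_to_5).
sessions = [
    (True, 42.0, 4),
    (True, 65.0, 3),
    (False, 90.0, 2),
    (True, 38.0, 5),
]

completed = [s for s in sessions if s[0]]
effectiveness = len(completed) / len(sessions)               # fraction of tasks completed
efficiency = sum(s[1] for s in completed) / len(completed)   # mean time per successful task
satisfaction = sum(s[2] for s in sessions) / len(sessions)   # mean survey rating

print(f"effectiveness={effectiveness:.0%}, "
      f"efficiency={efficiency:.1f}s/task, satisfaction={satisfaction:.2f}/5")
```

Whether numbers like these are meaningful for still-maturing technology, and whether they can be compared across domains, are exactly the open questions posed above.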
Another issue for evaluating ubiquitous computing systems is that we would like to employ rapid evaluation methods whose results are available in time to influence the final design. Creating and validating such methods would be an important contribution to the HCI community.
As the goal of ubiquitous computing is to have the computer disappear, one possible metric is the amount of distraction the user incurs in completing a task. Smailagic et al. (2001) have proposed a distraction matrix that classifies the duration of a distraction, from a "snap" (an extremely small amount of time) to an extended period, and examines the categories of information, communication, and creation with respect to these distractions. It seems feasible to put together scenarios for different cells of the matrix that could be used for evaluation. It has been suggested that reference tasks such as these would help the HCI community in general (Whittaker, Terveen, and Nardi, "Let's Stop Pushing the Envelope and Start Addressing It: A Reference Task Agenda for HCI", Human-Computer Interaction, vol. 15, no. 2-3, 2000, pp. 75-106). The question to address is the selection of representative primitive tasks.
A. Smailagic, D.P. Siewiorek, J. Anhalt, F. Gemperle, D. Salber, S. Weber, J. Beck, and J. Jennings. "Towards Context Aware Computing: Experiences and Lessons Learned." IEEE Intelligent Systems, vol. 16, no. 3, June 2001.
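One way to operationalize the matrix idea is sketched below: a table keyed by (distraction duration, activity category) into which observed task scenarios are recorded. The intermediate duration buckets ("short", "medium") and the example scenarios are our hypothetical placeholders, not taken from the cited paper.

```python
# Hypothetical encoding of a Smailagic-style distraction matrix: rows are
# distraction-duration classes, columns are activity categories, and each
# cell collects the task scenarios observed to fall there.
from collections import defaultdict

DURATIONS = ["snap", "short", "medium", "extended"]          # "short"/"medium" are assumed buckets
CATEGORIES = ["information", "communication", "creation"]

matrix = defaultdict(list)  # (duration, category) -> list of observed scenarios

def record(duration, category, scenario):
    assert duration in DURATIONS and category in CATEGORIES
    matrix[(duration, category)].append(scenario)

# Illustrative scenarios only.
record("snap", "information", "glance at next-meeting reminder")
record("extended", "creation", "dictate and edit a field report")

# Print the count of scenarios collected per cell.
for d in DURATIONS:
    cells = "  ".join(f"{c}={len(matrix[(d, c)])}" for c in CATEGORIES)
    print(f"{d:>8}: {cells}")
```

Populating each cell with a few representative scenarios would then yield a candidate set of reference tasks of the kind Whittaker, Terveen, and Nardi call for.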
Mirjana Spasojevic, HP Labs
Tim Kindberg, HP Labs

Chris Quintana, University of Michigan

Larry Arnstein, University of Washington
Jong Hee Kang, University of Washington
Gaetano Borriello, University of Washington and Intel Research

Heather Richter, Georgia Institute of Technology
Gregory Abowd, Georgia Institute of Technology

Mark Burnett, Dept. of Defence, Australia
Chris P. Rainsford, Dept. of Defence, Australia

Craig H. Ganoe, Virginia Tech
John M. Carroll, Virginia Tech

Prithwish Basu, Boston University
Wang Ke, Boston University
Thomas D.C. Little, Boston University

Paul Castro, UCLA
Ted Kremenek, UCLA
Richard Muntz, UCLA

Anind K. Dey, Intel

Kaj Makela, University of Tampere, Finland

Christopher A. Miller
Harry B. Funk
Matrix for Evaluation of Ubiquitous Computing Workshop
Jean Scholtz, Marty Herman, Sharon Laskowski, National Institute of Standards and Technology; Asim Smailagic and Dan Siewiorek, Carnegie Mellon University
Jean Scholtz, Ph.D., National Institute of Standards and Technology
100 Bureau Drive, Stop 8940, Gaithersburg, MD 20899-8940
Future plans and volunteer opportunities