CHI’02
Workshop Proposal
Automatic Capture, Representation, and Analysis of User Behavior
Sharon J. Laskowski
National Institute of Standards and Technology
100 Bureau Drive, MS 8940
Gaithersburg, MD 20899-8940 USA
+1 301 975 4535
sharon.laskowski@nist.gov
James A. Landay
EECS Department
University of California
Berkeley, CA 94720-1776 USA
+1 510 643 3043
Mike Lister
Netusability Limited
500 Chiswick High Road
London W4 5RG UK
+44 20 8956 2445
With the advent of the Web and the refinement of instrumentation and monitoring tools, software user interactions are being captured on a much larger scale than ever before. Automated support for the capture, representation, and empirical analysis of user behavior is leading to new ways to evaluate usability and validate theories of human-computer interaction. It enables remote testing, allows testing with larger numbers of subjects, and motivates the development of tools for in-depth analysis. The data capture can take place in a formal experimental setting or on a deployed system.
The main questions are: can we leverage these capabilities to validate or change our models, to improve the user experience, and to change the user interfaces in products in measurably better ways? How will human-computer interaction (HCI) and usability engineering (UE) as bodies of knowledge and practice change? How has HCI/UE research and practice changed as new analysis, design, and evaluation methods have emerged and been adopted?
Specifically, a number of different approaches based on these methods have appeared in the research literature [3,7] and in commercial tools [1,5,6,9]. However, these have raised a number of unresolved issues that are under discussion in both the HCI and UE communities, such as how and when to apply these methods, when remote, automated testing is useful, and what server logs can provide.
The goals of this workshop are to encourage discussion of these issues by the HCI and UE research communities, and, as a result, provide a foundation for a clearer understanding and more systematic application of these methodologies. This will be accomplished by:
The survey by Hilbert and Redmiles [3] can be used as a starting point for identifying the range of techniques. This paper describes the range of data that can be collected and how the data can be applied to a set of usability indicators. However, one of their conclusions is “…that more work is needed in the area of transformation and data collection to ensure that useful information can be captured in the first place, before automated analysis techniques…can be expected to yield meaningful results.” We expand on this observation in the next section.
Here are some of the issues we have identified that must be resolved before automated and semi-automated techniques can be used with confidence. The list is not intended to be comprehensive; it illustrates the kinds of issues we would like to see discussed at the workshop:
FORMAT OF THE WORKSHOP
Participant Solicitation and Selection
We will solicit participants from the HCI and UE research communities through all the usual mechanisms, including mail reflectors such as CHI-WEB and UTEST, and through personal contacts. We are looking for participants with extensive experience in building automated tools, in using them for usability analysis, or in empirically validating research hypotheses about user interaction with software, including, but not limited to, web-based applications. Since we are planning for the workshop to result in a set of papers addressing some of the issues, we expect participants to be willing and able to cooperate in this endeavor. We also expect the participants to be familiar with the work cited here and other similar research. Potential participants will be asked to provide this information:
We suggest 10-15 participants selected on the basis of their familiarity with these issues and their potential to contribute to solutions and follow-up papers.
The workshop is designed as a day-long, highly interactive effort for participants. It is structured around three goals: clearly identifying issues in automation, brainstorming answers that put more rigor into how these methodologies are applied, and getting this information out to the research communities. Here is the detailed timeline:
8:30- 9:30 Introductions; logistics; agenda
9:30-10:30 Brief examples/demos of state-of-the-art approaches, from the organizers and/or participants. These will be chosen in advance, based on who the participants are. Each of the organizers has developed tools that they can present, if appropriate.
10:30-11:00 Break
11:00-12:30 Presentation and discussion of the compilation of issues gleaned from the position papers. 3-6 of these will be chosen for detailed analysis and brainstorming in breakout sessions.
12:30- 1:30 Lunch
1:30- 3:00 Breakouts to brainstorm on issues; 1-2 issues per breakout group. Depending on the issue, the result of a breakout will be a detailed analysis of the issue, some proposed approaches for addressing it, or both. For example, a matrix of methods vs. applications or a plan for how to compare/evaluate methodologies with benchmark data could be outcomes.
3:00- 3:30 Break
3:30- 5:00 Reports from breakouts to full group and discussion. Identification of action items and discussion of potential papers and authors.
Pre-workshop Activities
We expect that the participants will read the position papers and any relevant related materials. The organizers will put together a public web site on a NIST server with all the information relevant to the workshop. This site will also serve as a repository of information after the workshop.
Plan for Dissemination
The organizers will write up the results of the workshop in a SIGCHI Bulletin paper, identify a publishing venue for the papers that, we hope, will be written after the workshop is held, and organize and edit those papers as appropriate.
Sharon Laskowski is a Computer Scientist and Manager of the Visualization and Usability Group at NIST where she supervises projects that involve evaluation methodologies and metrics for both information visualization and usability engineering. In particular, she directs the NIST Web Metrics project, which is developing proof-of-concept tools to support rapid, remote, and automated usability evaluation of web sites. She is also in charge of the Industry USability Reporting (IUSR) Project, which is creating and validating a common format for summative user test reporting. Sharon has participated in or facilitated a number of workshops at CHI and UPA. In 1999 she organized and hosted the 5th Human Factors and the Web Conference. She has published papers in various usability, human-computer interaction and visualization conferences and journals. Sharon received her Ph.D. in Computer Science at Yale University.
James Landay is an Assistant Professor of Computer Science at the University of California, Berkeley. He is also the CTO and co-founder of NetRaker, a provider of customer experience evaluation solutions for Web-based applications. He has published extensively in the area of human-computer interaction, including articles on user interface design tools, web site evaluation tools, gesture recognition, pen-based user interfaces, mobile computing, and visual languages. He leads the WebQuilt project at Berkeley. WebQuilt allows web designers and usability practitioners to easily run remote usability tests on web sites. He has published papers on this topic at the World Wide Web conference and in ACM Transactions on Information Systems. While he has participated in a number of workshops at CHI, he has not organized a workshop before. He received his Ph.D. in computer science from Carnegie Mellon University.
Mike Lister graduated from Kingston University with a BA in Graphic Design and acquired a wide range of art direction and studio experience working on a variety of projects in both design and advertising. He founded Safelight twenty-five years ago. Safelight became an early pioneer of computer graphics systems. In 1995 Mike Lister formed part of a committee that advised the UK Government on future strategy for the country concerning the Internet, the outcome of which was the Information Society Internet Initiative. Mike is now a director and Chief Technology Officer of Netusability Limited. At CHI 2001 Mike gave a demonstration of technology and participated in a panel on Market Research and Usability. Mike Lister is also a life member of the British Kinematograph Sound and Television Society (BKSTS), a Member of the British Photographic Export Group (BPEG) and a Member of the Usability Professionals Association (UPA).
Disclaimer: NIST does not recommend or endorse commercial products.
1. Chak, A., Usability Tools: A Useful Start, Web Techniques, August 2000. Also available at http://www.webtechniques.com/archives/2000/08/stratrevu/
2. Cugini, J. and Laskowski, S., Design of a File Format for Logging Website Interaction, NIST SP 500-248, April 2001. Also available at http://www.itl.nist.gov/iad/vug/cugini/webmet/flud/design-paper.html
3. Hilbert, D.M. and Redmiles, D.F., Extracting Usability Information from User Interface Events, in ACM Comput. Surv. 32, 4 (Dec. 2000), 384-421. Also available at http://www.fxpal.com/people/hilbert/papers.html
4. Hong, J.I., Heer, J., Waterson, S., and Landay, J.A., WebQuilt: A Proxy-based Approach to Remote Web Usability Testing. To appear in ACM Transactions on Information Systems. Also available at http://guir.cs.berkeley.edu/pubs/#webquilt
5. Lister, M., Usability Testing Software for the Internet, in Extended Abstracts of CHI '01 (Seattle, WA, April 2001), ACM Press, 17-18.
6. Moore, P., Conducting Experimental Research using Native Site Visitors, in IBM Make It Easy 2000 Conference at http://www-3.ibm.com/ibm/easy/eou_ext.nsf/Publish/1822
7. Pirolli, P., Card, S.K., and Van Der Wege, M.M., Visual Information Foraging in a Focus + Context Visualization, in Proceedings of CHI '01 (Seattle, WA, April 2001), ACM Press, 506-513.
8. Spool, J., Testing Web Sites: Five Users is Nowhere Near Enough, in Extended Abstracts of CHI '01 (Seattle, WA, April 2001), ACM Press, 285-286.
9. Vividence white paper, The Vividence Approach and Methodology. Available at http://www.vividence.com/public/Research/methodology.htm
250 WORD ABSTRACT FOR THE CALL FOR PARTICIPATION
Automatic Capture, Representation, and Analysis of User Behavior
We can now capture software user interaction on a much larger scale than ever before and, as a result, new approaches for evaluating usability and validating theories of human-computer interaction are being developed. The main questions are: can we leverage all this capability to validate our models, to improve the user experience, and to change the user interfaces in products in measurably better ways? How will human-computer interaction (HCI) and usability engineering (UE) as bodies of knowledge and practice change? How has HCI/UE research and practice changed as new analysis, design, and evaluation methods have emerged and been adopted? Unresolved issues related to these new methodologies are under discussion in both the HCI and UE communities, such as how and when to apply methods, when remote, automated testing is useful, and what server logs can provide. The goals of this workshop are to encourage researchers to exchange ideas on how to address these issues and provide a foundation for a clearer understanding and more systematic application of these methodologies. We are looking for participants with experience in building or using automated tools for analysis of usability or in empirically validating research hypotheses about user interaction. We expect participants to be willing to contribute to papers that result from this workshop. Potential participants are asked to note in their position papers: relevant experience, issues they have encountered and would like to address, and suggestions for the type of paper they would like to author or co-author. Send position papers to: sharon.laskowski@nist.gov
Automatic Capture, Representation, and Analysis of User Behavior
Sharon J. Laskowski
National Institute of Standards and Technology
100 Bureau Drive, MS 8940
Gaithersburg, MD 20899-8940 USA
+1 301 975 4535
sharon.laskowski@nist.gov
James A. Landay
EECS Department
University of California
Berkeley, CA 94720-1776 USA
+1 510 643 3043
Mike Lister
Netusability Limited
500 Chiswick High Road
London W4 5RG UK
+44 20 8956 2445
Automation, evaluation, log analysis, validation, remote testing, usability
With the advent of the Web and the refinement of instrumentation and monitoring tools, software user interactions are being captured on a much larger scale than ever before. Automated support for the capture, representation, and empirical analysis of user behavior is leading to new ways to evaluate usability and validate theories of human-computer interaction. It enables remote testing, allows testing with larger numbers of subjects, and motivates the development of tools for in-depth analysis. The data capture can take place in a formal experimental setting or on a deployed system.
The main questions are: can we leverage these capabilities to validate or change our models, to improve the user experience, and to change the user interfaces in products in measurably better ways? How will human-computer interaction (HCI) and usability engineering (UE) as bodies of knowledge and practice change? How has HCI/UE research and practice changed as new analysis, design, and evaluation methods have emerged and been adopted?
Specifically, a number of different approaches based on these methods have appeared in the research literature [3,7] and in commercial tools [1,5,6,9]. However, these have raised a number of unresolved issues that are under discussion in both the HCI and UE communities, such as how and when to apply these methods, when remote, automated testing is useful, and what server logs can provide.
The goals of this workshop are to encourage discussion of these issues by the HCI and UE research communities, and, as a result, provide a foundation for a clearer understanding and more systematic application of these methodologies. This will be accomplished by:
The survey by Hilbert and Redmiles [3] can be used as a starting point for identifying the range of techniques. This paper describes the range of data that can be collected and how the data can be applied to a set of usability indicators. However, one of their conclusions is “…that more work is needed in the area of transformation and data collection to ensure that useful information can be captured in the first place, before automated analysis techniques…can be expected to yield meaningful results.” We expand on this observation in the next section.
Here are some of the issues we have identified that must be resolved before automated and semi-automated techniques can be used with confidence. The list is not intended to be comprehensive; it illustrates the kinds of issues we would like to see discussed at the workshop:
· Depth vs. breadth: semi-automated user testing usually implies analyzing large datasets. But perhaps the need for automation is overstated and skilled testers should do user testing individually, at least in the case of evaluating a software application. For validating a theory, however, automation can be valuable. See, for example, how eye tracker data and visualizations of that data help validate the CTVA-foraging theory of how a user interacts with focus+context information visualizations in [7].
· In general, can the behavioral data be used to support or disprove theories of behavior, such as ACT-R, Soar, EPIC, Activity Theory, Foraging Theory, Behaviorism, etc.?
· What is lost as compared to traditional usability testing methods? Is an onsite observer essential for a semantically deep description of the experience? When is an observer not essential? This is an especially important issue for testing web usability remotely. With WebQuilt [4] and Enviz [6] it is assumed that some useful data will be collected without an observer, while Netusability’s technology [5] and NetRaker’s technology [1,6] capture some of the user reaction with a camera or chat window as part of the remote data capture. Server logs are used by some; they have severe limitations, but when are they useful?
· How many test participants are required? Recent results such as those in Spool [8] suggest that many more users than the traditionally accepted 5-8 are required under some circumstances when evaluating web sites. On the other hand, what do you do when you have a huge amount of data on user interaction? Can you mine it effectively?
· Is it feasible to develop one or more standard representations to allow data exchange and development of generic tools? For example, a standard format for user logs, such as that described in [2], will enhance interoperability among analysis tools. But can we find a single format to cover a wide range of user testing methodologies? What can be done to support mapping of low-level system and user events into higher-level descriptions? What about capture of the part of the system’s behavior that is apparent to the user, e.g., windows opening and closing, and the status of checkboxes? What context information needs to be captured?
· Can you effectively automate the tracking and managing of the web customer experience as in the approach described in [9]? This approach connects user satisfaction to click stream and page view user behavior for large sample populations. Can you infer the quality of the customer experience and how do you use this data for improving a web site? How does it compare to more traditional methods?
· Are there differences in data capture and analysis depending on whether data is supporting HCI research or usability evaluation?
· Which of the innumerable aspects of the user behavior should be captured? Mouse clicks? Eye gaze? Verbal self-reports and think-alouds? What can be done to automate the capture of this data?
· There are a number of technical issues relating to web-based applications. For example, can we abstract away from browser-specific event models? Is the Document Object Model (DOM) the answer? What is the best tool architecture for data capture: server-side instrumentation, customized browsers on the client side, or proxies in between the two?
· What analysis and visualization tools are useful to researchers and usability engineers?
· Is there a methodology for benchmarking approaches so that there is some assurance that the automated tools are indeed measuring usability?
· What about privacy concerns? Should users’ behavior ever be monitored without their explicit consent?
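To make one of the questions above concrete, the mapping of low-level system and user events into higher-level descriptions might look something like the following. This is a minimal sketch only. The `Event` record, its fields (`t`, `kind`, `target`), the event kinds, and the widget names are all hypothetical; this is not the log format described in [2], nor the output of any tool cited here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """One low-level user interface event (hypothetical record layout)."""
    t: float      # seconds since the start of the session
    kind: str     # assumed event kinds: "focus", "key", "click"
    target: str   # assumed widget identifier


def abstract_events(events: List[Event]) -> List[str]:
    """Collapse runs of low-level events into higher-level descriptions."""
    descriptions = []
    i = 0
    while i < len(events):
        e = events[i]
        if e.kind == "key":
            # Merge consecutive keystrokes on the same widget into one step.
            j = i
            while (j < len(events) and events[j].kind == "key"
                   and events[j].target == e.target):
                j += 1
            descriptions.append(f"typed {j - i} characters into {e.target}")
            i = j
        elif e.kind == "click":
            descriptions.append(f"clicked {e.target}")
            i += 1
        else:
            # Other event kinds (e.g. focus changes) are dropped in this sketch.
            i += 1
    return descriptions


# A made-up event stream, for illustration only.
log = [
    Event(0.0, "focus", "search_box"),
    Event(0.4, "key", "search_box"),
    Event(0.5, "key", "search_box"),
    Event(0.7, "key", "search_box"),
    Event(1.9, "click", "search_button"),
]
print(abstract_events(log))
```

Even this toy version shows where the open questions lie: which events to drop, how to decide that a run of events forms one step, and how much context (timing, widget state) must be kept for the higher-level description to remain useful.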
Disclaimer: NIST does not recommend or endorse commercial tools.
1. Chak, A., Usability Tools: A Useful Start, Web Techniques, August 2000. Also available at http://www.webtechniques.com/archives/2000/08/stratrevu/
2. Cugini, J. and Laskowski, S., Design of a File Format for Logging Website Interaction, NIST SP 500-248, April 2001. Also available at http://www.itl.nist.gov/iad/vug/cugini/webmet/flud/design-paper.html
3. Hilbert, D.M. and Redmiles, D.F., Extracting Usability Information from User Interface Events, in ACM Comput. Surv. 32, 4 (Dec. 2000), 384-421. Also available at http://www.fxpal.com/people/hilbert/papers.html
4. Hong, J.I., Heer, J., Waterson, S., and Landay, J.A., WebQuilt: A Proxy-based Approach to Remote Web Usability Testing. To appear in ACM Transactions on Information Systems. Also available at http://guir.cs.berkeley.edu/pubs/#webquilt
5. Lister, M., Usability Testing Software for the Internet, in Extended Abstracts of CHI '01 (Seattle, WA, April 2001), ACM Press, 17-18.
6. Moore, P., Conducting Experimental Research using Native Site Visitors, in IBM Make It Easy 2000 Conference at http://www-3.ibm.com/ibm/easy/eou_ext.nsf/Publish/1822
7. Pirolli, P., Card, S.K., and Van Der Wege, M.M., Visual Information Foraging in a Focus + Context Visualization, in Proceedings of CHI '01 (Seattle, WA, April 2001), ACM Press, 506-513.
8. Spool, J., Testing Web Sites: Five Users is Nowhere Near Enough, in Extended Abstracts of CHI '01 (Seattle, WA, April 2001), ACM Press, 285-286.
9. Vividence white paper, The Vividence Approach and Methodology. Available at http://www.vividence.com/public/Research/methodology.htm