TreeDec: an Annotation Tool to Support Website Navigation

John Cugini / cuz@nist.gov
Information Technology Laboratory
National Institute of Standards and Technology (NIST)


Contribution of the National Institute of Standards and Technology. Not subject to copyright. Reference to specific commercial products or brands is for information purposes only; no endorsement or recommendation by the National Institute of Standards and Technology, explicit or implicit, is intended.

Abstract

Websites are often organized into logical hierarchies, or tree structures, in order to help users navigate. Ideally, users could view the entire tree, or jump to nearby pages. TreeDec (= Tree Decorator) is a system to support website authors and maintainers by providing automatic annotation of webpages under the control of a central file that represents the tree structure.

Keywords

Breadcrumbs; Hierarchies; Tree structures; Websites; Website navigation

1. Background and Motivation

The users of a website need to have a sense of where they are within the website, and how to get from the current location to other relevant pages. So-called "breadcrumbs" (as defined in [Nielsen]) on a webpage allow the user to see logical context and navigate. For example, a page that lists putters for sale might have a heading like:
   All > Recreation > Sports > Golf > Putters
where each of the entries is a clickable link. Breadcrumbs are really an ancestor path (the term we will use henceforth): a list of parent, grandparent, etc., showing the logical subtree in which the current page lives.

Decorating each page of a website with its ancestor path correctly and consistently is a tricky and boring task if done by hand. Furthermore, the overall tree logically implicit in the paths may not be shown explicitly, even though such an overview could be valuable as a sitemap or table of contents.

The purpose of TreeDec is to make it easier for the website designer to maintain a consistent set of navigation aids on the pages of the website. Instead of directly manipulating all of the pages, the necessary information about the site's hierarchical structure is collected into a single file, which TreeDec uses to decorate the entire website.

TreeDec is one of a suite of tools called NIST Web Metrics [Webmet]. The objective of the NIST Web Metrics Testbed is to explore the feasibility of a range of tools and techniques that support rapid, remote, and automated testing and evaluation of website usability. For detailed information on downloading and operating TreeDec, please visit the Web Metrics site (http://www.nist.gov/webmetrics/).

2. Related Work

There are other systems whose goal is to provide website users with navigation aids beyond the links directly built in by the author. Several of these systems analyze the link structure and provide a clickable visualization. Other systems impose a tree structure on the website. Having done so, they then make that entire structure available to the user, either via a visualization or as a webpage.

TreeDec is unique in that its purpose is to provide navigation to all nearby tree nodes directly on each web page, not to provide a global view (although it does generate a Table of Contents webpage). The table below summarizes the various approaches.

System | Source of tree structure | Navigation aid(s)
WebTOC | Link structure | TOC within a page frame (global overview)
InXight | Link structure | Hyperbolic tree visualization (adjustable overview)
Breadcrumbs for Those Using ASP | Directory structure | Decorates dynamically generated webpages with ancestor path
Web Node Structure (wenost) | Specified by user | Decorates static webpages with pointers to parent, first child, next sibling, previous sibling
TreeDec | Specified by user or link structure | Decorates static webpages with ancestor path, all siblings, all children; also generates Table of Contents webpage

3. TreeDec System

The primary purpose of TreeDec is to allow users of a website to picture it as a tree. This is a familiar and easily understood structure. It should be emphasized that the logical organization of the pages of a website into a tree is independent of other organizational structures, such as the directory structure of the files or the link structure among the pages. TreeDec allows website authors and designers to impose a tree structure which does not depend directly on any of these other mechanisms. This tree structure is intended to be the logical view provided to the users of the website. The author maintains the logical tree structure in a single tree file and uses that file to control the systematic addition of navigation aids to all the webpages of the site.

3.1 The Tree File

The tree file anchors the TreeDec system. Each record of the tree file consists of some indentation, implicitly showing its position in the tree, the filename of the page to be annotated, and the title to be used in that annotation. The indentation consists of zero or more TAB characters, so that the appearance of the file naturally reflects the structure being imposed. In the special case where the title field of the record is "*", TreeDec will attempt to retrieve the title from the <title> element of the HTML file, instead of taking it explicitly from the tree file. Thus, the tree file contains all the information about which webpages are to be decorated, and how they are to be related to each other.
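The record layout described above can be illustrated with a minimal Python sketch. It is not TreeDec's own code (which is written in Perl). The separator between filename and title is not specified here, so the sketch assumes whitespace; the names Node, parse_tree_file, resolve_titles, and ancestor_path are hypothetical, and the sample filenames are invented for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    filename: str
    title: str
    depth: int
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list, repr=False, compare=False)


def parse_tree_file(text: str) -> List[Node]:
    """Parse TAB-indented records into nodes, returned in file order."""
    nodes: List[Node] = []
    stack: List[Node] = []  # stack[d] is the most recent node at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        body = line.lstrip("\t")
        depth = len(line) - len(body)  # number of leading TAB characters
        # Assumption: filename and title are separated by whitespace.
        parts = body.split(None, 1)
        filename = parts[0]
        title = parts[1].strip() if len(parts) > 1 else "*"
        if depth > len(stack):
            raise ValueError(f"record skips an indentation level: {line!r}")
        node = Node(filename, title, depth)
        del stack[depth:]  # drop deeper nodes; stack[-1] is now the parent
        if stack:
            node.parent = stack[-1]
            stack[-1].children.append(node)
        stack.append(node)
        nodes.append(node)
    return nodes


def title_from_html(page_html: str) -> Optional[str]:
    """Return the text of the first <title> element, if any."""
    m = re.search(r"<title[^>]*>(.*?)</title>", page_html, re.IGNORECASE | re.DOTALL)
    return m.group(1).strip() if m else None


def resolve_titles(nodes: List[Node], read_page: Callable[[str], str]) -> None:
    """Replace each "*" title with the page's own <title>, when one is found."""
    for node in nodes:
        if node.title == "*":
            found = title_from_html(read_page(node.filename))
            if found:
                node.title = found


def ancestor_path(node: Node) -> List[str]:
    """Titles from the root down to the given node."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current.title)
        current = current.parent
    return path[::-1]


if __name__ == "__main__":
    sample = "index.html\tWorld\n\teurope.html\tEurope\n\t\tfrance.html\t*\n\tasia.html\tAsia\n"
    pages = {"france.html": "<html><head><title>France</title></head></html>"}
    tree = parse_tree_file(sample)
    resolve_titles(tree, lambda name: pages[name])
    print(" > ".join(ancestor_path(tree[2])))
```

Passing the page reader as a function keeps the sketch independent of where the HTML files actually live.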

The question then becomes: how to generate this tree file? To the extent that the tree structure of the website is designed, it cannot be automatically generated from other sources. It is perfectly valid to create the tree file manually, using a simple text editor. Part of the purpose of TreeDec is to allow the author this freedom of design, rather than trying to derive the structure in a fully automatic way.

3.2 Support for Generation of Tree File

The TreeDec system, however, provides pre-processing utilities to automatically generate a first approximation of the tree file, which serves as a basis for precise editing by the designer. This approximation is based on the static link structure of the website. The publicly available software Linklint [Linklint] performs a complete analysis of that link structure. The GenTree utility then uses this analysis to generate a valid tree file from a designated root page.

Note that this is not guaranteed to generate a perfectly satisfactory logical tree. Links within pages may skip over logical levels, the order of siblings may not be correct, and so on. The tree file is reasonably comprehensible, however, and may be further customized by hand to achieve the desired result.
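One plausible way to derive a tree from a link structure is a breadth-first traversal from the root page, attaching each page to the first page that links to it. The Python sketch below illustrates that idea only; it is an assumption, not a description of how GenTree actually works, and the input dictionary is a hypothetical stand-in for Linklint's analysis, not its real output format.

```python
from collections import deque
from typing import Dict, List


def spanning_tree(links: Dict[str, List[str]], root: str) -> Dict[str, List[str]]:
    """Breadth-first spanning tree: each page becomes a child of the
    first page (in traversal order) that links to it."""
    children: Dict[str, List[str]] = {root: []}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in children:  # first time this page is reached
                children[target] = []
                children[page].append(target)
                queue.append(target)
    return children


def to_tree_file(children: Dict[str, List[str]], root: str) -> str:
    """Emit TAB-indented records; the title "*" defers to each page's <title>."""
    lines: List[str] = []

    def walk(page: str, depth: int) -> None:
        lines.append("\t" * depth + page + "\t*")
        for child in children[page]:
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical link structure: page -> pages it links to.
    site = {
        "a.html": ["b.html", "c.html"],
        "b.html": ["a.html", "d.html", "c.html"],
        "c.html": ["d.html"],
    }
    print(to_tree_file(spanning_tree(site, "a.html"), "a.html"), end="")
```

A breadth-first traversal places each page as close to the root as the links allow, which is exactly why links that skip logical levels can produce a tree that needs hand editing.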

3.3 Configuration file

The configuration file controls how TreeDec goes about its decoration chores. Each record in the file is essentially a keyword=value parameter.
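As an illustration of the keyword=value record format, a reader for such a file might look like the following Python sketch. The keyword names in the example are hypothetical and are not actual TreeDec parameters; skipping blank lines and "#" comment lines is likewise an assumption.

```python
from typing import Dict


def parse_config(text: str) -> Dict[str, str]:
    """Read keyword=value records into a dictionary.

    Assumptions (not taken from TreeDec): blank lines and lines starting
    with "#" are ignored, and whitespace around keyword and value is dropped.
    """
    config: Dict[str, str] = {}
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected keyword=value, got {raw!r}")
        config[key.strip()] = value.strip()
    return config


if __name__ == "__main__":
    # Hypothetical keywords, for illustration only.
    example = "tree_file = site.tree\n# where the navigation block goes\nplacement=top\n"
    print(parse_config(example))
```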

3.4 Website Decoration

TreeDec processes the webpage files, systematically adding a set of links to each. This allows the website user to navigate to other pages nearby in the tree. Specifically, TreeDec can generate three sets of related links: the ancestor path, all siblings, and all children of the current page.

TreeDec will delete earlier TreeDec decorations, so it can be run against either an undecorated or previously decorated website.
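The decoration step, including safe re-runs, can be sketched in Python as follows. How TreeDec actually marks and recognizes its own decorations is not described here; the comment markers, the placement directly after the <body> tag, and all function names below are assumptions made for illustration.

```python
import re
from html import escape
from typing import Dict, List, Optional, Tuple

# Hypothetical markers delimiting the generated block.
BEGIN = "<!-- treedec-begin -->"
END = "<!-- treedec-end -->"
OLD_BLOCK = re.compile(
    r"\n?" + re.escape(BEGIN) + r".*?" + re.escape(END) + r"\n?", re.DOTALL
)


def related(parent: Dict[str, Optional[str]], page: str) -> Tuple[List[str], List[str], List[str]]:
    """Return (ancestors root-first, siblings, children) for a page.
    `parent` maps each page to its parent, or None for the root."""
    ancestors: List[str] = []
    p = parent[page]
    while p is not None:
        ancestors.append(p)
        p = parent[p]
    ancestors.reverse()
    siblings = [q for q, par in parent.items() if par == parent[page] and q != page]
    children = [q for q, par in parent.items() if par == page]
    return ancestors, siblings, children


def nav_block(parent: Dict[str, Optional[str]], titles: Dict[str, str], page: str) -> str:
    """Build the marked block of links for one page."""
    def links(pages: List[str]) -> str:
        return " | ".join(
            f'<a href="{escape(p, quote=True)}">{escape(titles[p])}</a>' for p in pages
        )

    ancestors, siblings, children = related(parent, page)
    rows = [
        BEGIN,
        f"<p>Ancestors: {links(ancestors)}</p>",
        f"<p>Siblings: {links(siblings)}</p>",
        f"<p>Children: {links(children)}</p>",
        END,
    ]
    return "\n".join(rows) + "\n"


def decorate(page_html: str, block: str) -> str:
    """Remove any earlier marked block, then insert the new one after <body>."""
    clean = OLD_BLOCK.sub("", page_html)
    m = re.search(r"<body[^>]*>", clean, re.IGNORECASE)
    pos = m.end() if m else 0  # fall back to the top of the file
    return clean[:pos] + "\n" + block + clean[pos:]
```

Because the old block is stripped before the new one is inserted, decorating an already decorated page gives the same result as decorating the original.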

3.5 TOC output

In addition, a very simple table of contents file is created (essentially mirroring the tree file), giving the user an overview of the entire website.
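Since the table of contents essentially mirrors the tree file, it can be produced by turning indentation depth into nested lists. The Python sketch below shows one way to do this; the nested <ul> markup is an assumption and not necessarily the layout TreeDec emits, and toc_html is a hypothetical name.

```python
from html import escape
from typing import List, Tuple


def toc_html(records: List[Tuple[int, str, str]]) -> str:
    """Render (depth, filename, title) records, in tree-file order,
    as a nested HTML list. Depth may increase by at most one per record."""
    if not records:
        return ""
    out: List[str] = []
    prev = -1  # depth of the previous record
    for depth, filename, title in records:
        if depth > prev + 1:
            raise ValueError(f"{filename}: depth jumps from {prev} to {depth}")
        if depth > prev:
            out.append("<ul>")  # open a sublist inside the previous item
        elif depth == prev:
            out.append("</li>")  # close the previous sibling
        else:
            # close the previous item, then one sublist and its owner per level
            out.append("</li>" + "</ul></li>" * (prev - depth))
        out.append(f'<li><a href="{escape(filename, quote=True)}">{escape(title)}</a>')
        prev = depth
    out.append("</li>" + "</ul></li>" * prev + "</ul>")
    return "\n".join(out)


if __name__ == "__main__":
    print(toc_html([(0, "a.html", "World"), (1, "b.html", "Europe"), (1, "c.html", "Asia")]))
```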

3.6 TreeDec Implementation

The components of TreeDec are written in Perl. TreeDec assumes the existence of a parent directory.

Here is an outline of the logic of TreeDec:

3.7 TreeDec Example

For a geography-oriented example of the output of TreeDec, please visit http://zing.ncsl.nist.gov/WebTools/TreeDec/example/a101.html to see a typical root page and visit http://zing.ncsl.nist.gov/WebTools/TreeDec/example/td-toc.html for the table of contents. Note that this website has no actual content - it just illustrates the navigation aids.

4. Open Issues

TreeDec currently handles static pages only; there is no support for dynamically generated pages. TreeDec could probably be extended to work with static template pages (which would then be filled in with dynamic content) as long as the hierarchical relationship of these could be specified in advance.

Also, it would be useful to have more empirical research to determine how and to what extent these navigational aids help users, the most favorable page placement for them, which types (ancestors / siblings / children) are used most often, and so on.


References

[Linklint]
http://www.linklint.org
[Nielsen]
Jakob Nielsen, "Designing Web Usability: The Practice of Simplicity", New Riders, Indianapolis, Ind. 2000.
[Webmet]
http://www.nist.gov/webmetrics/