Analysis begins when the website under consideration is processed by the Linklint software, which determines its static link structure. Linklint writes this information to several files, one of which (named fileF.txt) is then further processed by the Perl script convert-ll. Here are the details:
Website:
Linklint can analyze either the original website or the copied instrumented version. We will concentrate on the latter, since it simplifies later steps. We assume that the website resides in $WEBLOG_DATA/site-2d-tested/website/.

Run Linklint:
Read the full documentation of linklint to find out about
all the options. The following is a suggested procedure.
Use a text editor to build a command-line file in $WEBLOG_DATA/site-2d-tested/webstruct/, named something like cmdline.dat. This file will contain parameters (one per line) to be used by linklint. A sample cmdline.dat file might be:

    -http
    -host www.yourserver.com
    -net
    -doc .
    -limit 8000
    -xref

This says to use the http protocol to access webpages, gives the hostname of the website, and specifies that results are to be written into the current directory.

Change your current directory to $WEBLOG_DATA/site-2d-tested/webstruct/ and invoke linklint something like this:

    linklint @cmdline.dat $WEBLOG_DATA/site-2d-tested/website/@

This command line says to use the cmdline.dat file for control parameters, and points to one or more seeds (in this case, just the files within the "website" directory) to be scanned within website.

As a result, a number of informative files are written within the webstruct directory. Note especially the error files (such as error.html), as these may indicate needed repairs to the web pages. For our purposes, the important one is fileF.txt. This file contains complete information about all the web pages in the directory and how they are inter-linked. It also contains information about external (to the website) pages that are directly linked to from within the website.
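
The procedure above can be collected into a small shell script. What follows is only a sketch under stated assumptions, not part of the original tool set: it assumes linklint is on your PATH, it uses the placeholder hostname and sample parameters from the text, and it falls back to a temporary directory when WEBLOG_DATA is unset (purely so the script can be tried without touching real data).

```shell
#!/bin/sh
# Sketch of the suggested Linklint procedure.
# Assumption: WEBLOG_DATA normally points at your data root; a temporary
# directory is substituted only when the variable is unset or empty.
: "${WEBLOG_DATA:=$(mktemp -d)}"
site="$WEBLOG_DATA/site-2d-tested"
mkdir -p "$site/webstruct"
cd "$site/webstruct" || exit 1

# One parameter per line, copied from the sample cmdline.dat in the text.
cat > cmdline.dat <<'EOF'
-http
-host www.yourserver.com
-net
-doc .
-limit 8000
-xref
EOF

# Run Linklint only if it is installed; otherwise just show the command.
if command -v linklint >/dev/null 2>&1; then
    linklint @cmdline.dat "$site/website/@"
else
    echo "linklint not found; would run: linklint @cmdline.dat $site/website/@"
fi

# fileF.txt is the file that convert-ll reads in the next step.
if [ -f fileF.txt ]; then
    echo "fileF.txt written"
else
    echo "fileF.txt missing; check the Linklint error files"
fi
```

Replace www.yourserver.com with the real hostname before running the script against an actual site.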

convert-ll:
Assuming your current directory is still
$WEBLOG_DATA/site-2d-tested/webstruct/,
you can invoke convert-ll.perl something like this:

    $WEBMET_TOOLS/resources/weblink/convert-ll.perl dir=.

This causes convert-ll to read fileF.txt and then generate three output files in the current directory: pages.dat, links.dat, and url2nn.dat.

The input and output directories can also be named separately. For example,

    $WEBMET_TOOLS/resources/weblink/convert-ll.perl indir=here outdir=there

would cause the input to be read from here/fileF.txt and the output written to there/pages.dat, there/links.dat, and there/url2nn.dat.
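
After convert-ll finishes, it can be useful to confirm that all three output files were actually produced. The helper below is a hypothetical convenience, not something shipped with the tools: check_outputs is a name introduced here for illustration, and it only tests that pages.dat, links.dat, and url2nn.dat exist and are non-empty, without assuming anything about their contents.

```shell
#!/bin/sh
# check_outputs DIR: report which of the three convert-ll output files
# named in the text are missing or empty in DIR (default: current directory).
# Returns non-zero if any of them is missing or empty.
check_outputs() {
    dir=${1:-.}
    missing=0
    for f in pages.dat links.dat url2nn.dat; do
        if [ -s "$dir/$f" ]; then
            echo "ok:      $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

For the first invocation form, call check_outputs . from the webstruct directory; for the indir/outdir form, call check_outputs there.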

Version 3.0. Page last modified: 15 May 2002. National Institute of Standards and Technology (NIST)