FLUD Format and Parser | Download | Installation | Operation | Results

Invocation and Parameters

Parameters for file naming and generation

The first parameter of the invocation designates the name of the website in which the user session to be parsed took place. The second parameter is the name of the session itself. The website and session names determine where input and output files are located. The parser reads in a raw FLUD file, checks it for syntax errors, and generates up to three kinds of output files as a result. Subsequent parameters indicate which of these output files are to be written, e.g. h+ or h- to write or not write the pretty-print HTML file. See table below for a summary of file properties.
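
The output-selection parameters can be pictured with a minimal Python sketch. It is not the parser's own code: the function name parse_output_flags is hypothetical, and the default values are an assumption inferred from Example 1 on this page, which describes "u+ p+ h-" as exactly the opposite of the defaults.

```python
# Hypothetical helper for illustration; not part of flud-parser.perl.
# Assumed defaults, inferred from Example 1: userpath and parse files off,
# pretty-print HTML file on.
DEFAULTS = {"u": False, "p": False, "h": True}

def parse_output_flags(params):
    """Return which output files to write, given parameters such as 'u+' or 'h-'."""
    flags = dict(DEFAULTS)
    for param in params:
        # Only two-character switches like 'u+' are handled here; other
        # parameters (for example host= or dirsub=) are simply skipped.
        if len(param) == 2 and param[0] in flags and param[1] in "+-":
            flags[param[0]] = (param[1] == "+")
    return flags

print(parse_output_flags(["u+", "p+", "h-"]))
# prints {'u': True, 'p': True, 'h': False}
```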

Special parameters for u+ parsing

The u+ parameter causes the parser to attempt to generate userpath (*.up) files (one per task within the session) which will be accepted by VisVIP. To do this, the URLs in the .ulog file have to be associated with the URLs in the $WEBLOG_DATA/website/webstruct/url2nn.dat file. This file contains the mapping from full URLs to the nicknames used by VisVIP.

In the usual WebVIP process for generating .ulog files, a copy of the actual website is made and then this site is instrumented. The .ulog file is generated on the instrumented site. The url2nn.dat file, however, may represent either the original website or the instrumented site, depending on which website was analyzed. If the former, you need the host and dirsub parameters to make the association between the original website (encoded in url2nn.dat) and instrumented website (encoded in the *.ulog file).

The parser accepts an optional "host=" parameter, which is used when analyzing URLs within the .ulog file. If absent, the hostname information is taken from the url2nn.dat file. Also, the parser accepts an optional "dirsub=#from_dir#to_dir#" parameter that causes substitution of high-level directory names. It is assumed that there is a commonly named subtree for the original and instrumented site. The dirsub substitution is applied only to URLs which match the hostname. See example below.

Examples

Example 1: File Naming and Generation

   flud-parser.perl  my-website john_37 u+ p+ h-
All files are within the $WEBLOG_DATA/my-website/ directory tree. The command line says to analyze the logfile named "sessions/john_37/john_37.ulog" and to generate userpath files ("sessions/john_37/findcity-t1.up", "sessions/john_37/buyticket-t2.up", etc.) and a parse file ("sessions/john_37/john_37.parse"), but not an HTML file ("sessions/john_37/john_37.html"), which is exactly the opposite of the defaults. The url-to-nickname file ("webstruct/url2nn.dat") is used to generate the userpath files.

Example 2: Reconciling the Original and Instrumented Website

Suppose the original website (as recorded in url2nn.dat) is in the directory:
   http://operate.biz.com/external/sales/...  
and the copied, instrumented site (as recorded in the .ulog file) is under:
   http://develop.biz.com/smith/testing/sales/...
Then, the parser might be invoked like this:
   flud-parser.perl  my-website  session_37  u+ \
      host=develop.biz.com                      \
      dirsub=#/smith/testing#/external#
The "#" character used to delimit the directory names is arbitrary; the parser will use the first character, whatever it is, as the delimiter. Obviously, this delimiter character must not appear within the directory names.
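
The delimiter rule and the hostname check can be sketched in Python as follows. This is an illustration under stated assumptions, not the parser's implementation: the function names parse_dirsub and apply_dirsub are hypothetical, the sketch assumes the closing delimiter is always present (as in the example above), and it rewrites only the directory part of the URL. How the parser itself reconciles the two hostnames is not covered here.

```python
from urllib.parse import urlsplit, urlunsplit

def parse_dirsub(spec):
    """Split a spec such as '#/smith/testing#/external#' on its first character."""
    if not spec:
        raise ValueError("empty dirsub spec")
    delimiter = spec[0]
    parts = spec.split(delimiter)
    # A well-formed spec yields ['', from_dir, to_dir, ''] (assumed shape).
    if len(parts) != 4 or parts[0] != "" or parts[3] != "":
        raise ValueError(f"malformed dirsub spec: {spec!r}")
    return parts[1], parts[2]

def apply_dirsub(url, host, from_dir, to_dir):
    """Replace a leading from_dir with to_dir, but only if the URL's host matches."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    if netloc != host:
        return url  # URLs on other hosts are left untouched
    if path == from_dir or path.startswith(from_dir + "/"):
        path = to_dir + path[len(from_dir):]
    return urlunsplit((scheme, netloc, path, query, fragment))

from_dir, to_dir = parse_dirsub("#/smith/testing#/external#")
print(apply_dirsub("http://develop.biz.com/smith/testing/sales/index.html",
                   "develop.biz.com", from_dir, to_dir))
# prints http://develop.biz.com/external/sales/index.html
```

Because the first character is taken as the delimiter, a spec written as "|/smith/testing|/external|" would be split the same way.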

Directory Structure

The FLUD parser uses the normal Web Metrics directory structure for its input and output files. Specifically, all the files are in the $WEBLOG_DATA/website/ directory (where the value of website is taken from the first invocation parameter). Take a look at the Web Metrics dataflow diagram to get a sense of how the FLUD parser fits within the Web Metrics tool set.

Assume john_37 is the name of the session. The following table shows all the files that are used.

Purpose               Direction  Example name                          Controlling parameter
Raw FLUD file         Input      sessions/john_37/john_37.ulog         None
Parse file            Output     sessions/john_37/john_37.parse        p+ or p-
Pretty-print file     Output     sessions/john_37/john_37.html         h+ or h-
Userpath files        Output     sessions/john_37/do_something-t1.up   u+ or u-
URL to nickname file  Input      webstruct/url2nn.dat                  u+ or u-
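
The naming scheme in the table can be sketched in Python as follows. The function name session_files and its weblog_data argument are hypothetical and serve only to show how the website and session names combine into file locations. Userpath files are named after the tasks in the session, so the sketch returns only the directory that holds them.

```python
import os
import posixpath

def session_files(website, session, weblog_data=None):
    """Build the file locations from the table for one website and session."""
    # Fall back to the WEBLOG_DATA environment variable when no root is given.
    root = weblog_data if weblog_data is not None else os.environ.get("WEBLOG_DATA", "")
    site_dir = posixpath.join(root, website)
    session_dir = posixpath.join(site_dir, "sessions", session)
    return {
        "ulog": posixpath.join(session_dir, session + ".ulog"),    # input
        "parse": posixpath.join(session_dir, session + ".parse"),  # written with p+
        "html": posixpath.join(session_dir, session + ".html"),    # written with h+
        "userpath_dir": session_dir,                               # *.up files, written with u+
        "url2nn": posixpath.join(site_dir, "webstruct", "url2nn.dat"),  # read with u+
    }

# "/data" stands in for an actual $WEBLOG_DATA value.
for name, path in session_files("my-website", "john_37", "/data").items():
    print(name, path)
```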


Version 1.1
Page last modified: 15 May 2002
National Institute of Standards and Technology (NIST)