Indexing





To index a collection of files, each containing one or more documents, with FreeWAIS-sf, first check which document types are supported. The waisindex manual page lists them, as does running waisindex without arguments.

If your document object is one of the supported types, run the waisindex command with the -t argument:

waisindex -d index_file_root_name -t doc_type object object ...

where:

-d
gives the root name for the collection of index files; waisindex appends its own suffixes to this name
-t
gives the document type of the objects, which must be one of the types supported by the waisindex command
object
is the file name of a target object to be indexed by the command.

Both the -d and object specifications accept full pathnames; a name given without a path refers to the current directory.
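
As an illustration of the command form above, here is a minimal sketch. The index root name memos, the file names, and the document type text are assumptions made for the example; check the list printed by waisindex itself before relying on a type name.

```shell
#!/bin/sh
# Hypothetical values: replace them with your own index name, type, and files.
index_root="memos"               # -d: root name; waisindex appends the suffixes
doc_type="text"                  # -t: assumed to be a supported document type
objects="memo1.txt memo2.txt"    # files to index (hypothetical names)

# Assemble the command line in the form shown above.
cmd="waisindex -d $index_root -t $doc_type $objects"

if command -v waisindex >/dev/null 2>&1; then
    $cmd                         # run the indexer when it is installed
else
    echo "would run: $cmd"       # otherwise only show the command
fi
```

Because no pathnames are given, both the index files and the objects are looked up in the current directory.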

If you have a document in an unsupported format, or would like to split individual documents into fields, you must create two files that describe the document format: a field definition file (<database>.fde) and a format file (<database>.fmt).

First, decide which fields you will use and what their names should be. It is usually a good idea to record what each field contains or means. The field definition file <database>.fde holds this information. Here is an example:

py: publication year
au: author
ti: title
jt: journal title
ck: citation key
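
One way to create such a file is sketched below. The database name bib is hypothetical, and placing the file in the current directory next to the index files is an assumption; the field lines are the ones from the example above.

```shell
#!/bin/sh
# Hypothetical database name; the file is assumed to belong next to the index files.
database="bib"

# Write the field definitions, one "name: description" pair per line.
cat > "$database.fde" <<'EOF'
py: publication year
au: author
ti: title
jt: journal title
ck: citation key
EOF

# Sanity check: list the field names (the text before the first colon).
fields=$(cut -d: -f1 "$database.fde" | paste -s -d ' ' -)
echo "fields: $fields"
```

Listing the names is only a quick check that each line has a name before the colon; it does not validate the file the way waisindex would.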

If waisindex finds a field definition file, it puts the names of the generated fields into the server description (<database>.src) that it produces.

Now comes the hard part: you have to write a format file <database>.fmt for your new database. Look at the examples on our ftp server if the following is too obscure.

The abstract syntax for the specification files follows:


Ulrich Pfeifer
Thu May 25 16:37:04 MET DST 1995