The file `freeWAIS-sf-2.0.tar.gz' is a tar archive compressed with GNU
gzip. Unpack it with the following command:

gunzip -cd freeWAIS-sf-2.0.tar.gz | tar xvf -

The directory structure is as follows:
freeWAIS-sf-2.0
|-FIELD-EXAMPLE
|-bin
|-ctype
|-doc
| |-CNIDR
| |-SF
| |-original-TM-wais
| | |-manl
|-ir
|-lib
|-regexp
|-ui
|-x
The directories contain
The distribution comes with a
Configure script generated by
metaconfig which you should run next. It determines various system
properties and asks you some questions.
Configure generates some files:
#include statements by most C files.
*.SH files when producing the corresponding files. You may change settings in `config.sh' and rerun
Configure with the options
Configure by submitting the following to your shell:
Configure will display a startup message and begin to check the
required system properties.
Intermixed with questions about various system properties are some questions about the desired properties of the freeWAIS-sf system itself. These are explained in a little more detail below.
Configure prints the name of the files it generates and
some short advice on how to proceed.
If your system has support for regular expressions, the first question will be:
... Checking if your systems regexp implementation works ...
Yes, it works
Do you want to use your systems regexp.h? [n]
It is usually safe to accept the default `no' and use the regular expressions included in the distribution. If that does not work (this may be the case with IRIX), try your system's regular expressions by answering `yes'. In fact, every answer other than `no' will be interpreted as `yes'.
The next question refers to the size of the headline file. If you will not have large databases, answer `no' here. To estimate the size of your headline file, multiply the number of documents you want to index by the average headline length. If this size fits in three bytes (that is, it is below 16 MB), it is safe to accept the default.
Will you have HEADLINE files greater than 16 MB [n]
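
The estimate described above is simple arithmetic: three bytes can address 2^24 = 16,777,216 bytes, which is the 16 MB limit named in the question. The following is a minimal sketch in C. The function name and the sample numbers are illustrative assumptions and not part of freeWAIS-sf.

```c
/* Largest size addressable with a three-byte offset: 2^24 bytes = 16 MB. */
#define THREE_BYTE_LIMIT (1UL << 24)

/* Hypothetical helper (not part of freeWAIS-sf): returns 1 if the estimated
 * headline file size stays below the three-byte limit, so that the default
 * answer `no' is safe; returns 0 otherwise. */
int headline_fits_three_bytes(unsigned long documents,
                              unsigned long avg_headline_len)
{
    /* Reject early if the product would reach the limit; this also
     * avoids overflow where unsigned long is only 32 bits wide. */
    if (avg_headline_len != 0 &&
        documents > (THREE_BYTE_LIMIT - 1) / avg_headline_len)
        return 0;
    return documents * avg_headline_len < THREE_BYTE_LIMIT;
}
```

For example, 100,000 documents with an average headline of 80 bytes give about 8 million bytes, so the default is fine. 300,000 such documents give about 24 million bytes, which calls for answering `yes'.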
`config.h': bool MYREGEXP
If the answer is different from `no', a definition of
MYREGEXP is added to `config.h'.
The next decision selects one of two possible query extensions.
This version of freeWAIS-sf supports new proximity operators by Tom Snee. He also fixed the string search code. You can now enable them at the cost of dropping the string search capability. There is currently no description of the proximity operators. Have a look in ir/query_l.l to learn about them;
Use proximity instead of string search? [n]
`config.h': bool PROXIMITY
If the answer is different from `no', a definition of
PROXIMITY is added to `config.h'.
Configure now asks for the character set you want to use.
All characters for which your
assumed to be legal within words.
You can augment this character set by additional chars. If you want `C++' to be a valid word, you must make `+' an allowed character. This mechanism is usually used to add country specific ISO characters to the default ASCII set.
You can compile freeWAIS-sf with it's own ctype package. You should do this, if you want to use special (country specific) chars, which are not supported by your systems ctype.
Use your systems ctype? [n]
`config.h': bool MYCTYPE
If the answer is different from `no', a definition of
MYCTYPE is added to `config.h'. This also leads to the
inclusion of the `ctype' directory in the building process. So a
-I../ctype is added to the compiler flags.
The following question is asked only if the default is accepted.
I will now ask for your special letters. If you do not want to give the now, edit config.h after this Configure run. Input your upper case letters in the same order than your lower case letters. toupper() and tolower() depend on this order. Input letters which are upper and lower case in both strings.
`config.h': string LCHARS
`config.h': string UCHARS
The character strings are added as extension of
UCHARS. Note that the order of characters matters, since the
index position is used for the macros
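
Why the order matters can be illustrated with a small sketch. It assumes that a lower-case character found at index i of LCHARS maps to the character at index i of UCHARS. The function name and the sample ISO 8859-1 letters are assumptions made for illustration; the actual macros in the `ctype' directory may be implemented differently.

```c
#include <string.h>

/* Sample extension strings (assumed values): a-, o- and u-umlaut in
 * ISO 8859-1. Both strings list the letters in the same order. */
static const char LCHARS[] = "\344\366\374";   /* lower case */
static const char UCHARS[] = "\304\326\334";   /* upper case */

/* Hypothetical upper-casing routine, not the distribution's macro. */
int sketch_toupper(int c)
{
    const char *p;

    if (c >= 'a' && c <= 'z')           /* plain ASCII letters */
        return c - 'a' + 'A';
    if (c > 0 && c <= 255) {
        p = strchr(LCHARS, c);          /* find the index in LCHARS ... */
        if (p != NULL)                  /* ... and use it to index UCHARS */
            return (unsigned char) UCHARS[p - LCHARS];
    }
    return c;                           /* everything else is unchanged */
}
```

If the two strings were entered in different orders, a letter would be paired with the wrong upper-case partner, which is why Configure asks for both strings in the same order.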
You can compile and link the clients with the capability to search index files directly, so you do not need to install a server for local searches. The clients will be larger, but faster with local searches.
You will have to run the test (see section Running the tests) manually without the switch!
Do you want to compile with -DLOCAL_SEARCH? [y]
`config.h': bool LOCAL_SEARCH
Server code is linked to the clients when true.
You can modify the URL document type to put the URL of the indexed document in the document id instead of the headline. If you use this modification, you can customize the headline (e.g. with the `-t fields' option). It is also no longer necessary to keep a copy of the documents for retrieval with the WAIS server. Currently, however, only SFgate can handle these modified document ids. Normal clients will not interpret them correctly and will try to retrieve the document from a WAIS server instead of the corresponding HTTP server. See section `Overview' in The SFgate Manual.
Do you want to use the modified URL handling? [y]
`config.h': bool URLDOCID
The DocumentId is used to carry the URL to the client. The server will be unable to access the document. You can remove it after indexing.
Installation prefix to use? (~name ok) [/usr/local/ls6/wais]
I added a patch from Alberto Accomazzi for speeding up usage of synonym files. He writes about his patch:
For those of you who have fairly large synonym files (> 10Kb) and are running the software on a machine that supports shared memory (your machine does) enabling this feature will speed up the waisserver response time by a significant factor.
For those of you who do not have shared memory, I have rewritten the memory allocation part of `synonym.c' so that bigger memory chunks are allocated and used rather than allocating memory for each word and synonym, so the code should be a little faster for you too.
Do you want to use shm cache? [n]
As distributed, freeWAIS-sf Release 2.0 will send a UDP packet to my workstation every time waisserver reindexes its info database, containing your (numeric) UID, your operating system, your compiler version and the freeWAIS-sf version.
This is just because I would like to get an idea of which systems and compilers freeWAIS-sf has been ported to, and of how many people use it and keep on using it (rather than trying it once). It will never become a licensing scheme or some crazy thing like that. You can, however, disable it by answering `yes' to the following question. If you do that, please let me know if you are running freeWAIS-sf on a system or compiler that is not mentioned in the `README'.
Disable the UDP packet sending? [n]
Ok. Thank you for your trust.
When you have generated the makefiles using
(see section Running
Configure), you can compile the system by just entering
`make all'. This will also index a test database and run some
tests. Here is the list of the make targets:
If you need the x clients
cd to the x subdirectory and enter
make depend prior to
make all and
After successful compilation the
make process indexes a test
database and runs some queries. You will have to start the tests
manually if you did not choose the
option. See section
cd FIELD-EXAMPLE
../ir/waisserver -d . -p 9565
make test
...
kill waisserver-pid
The results of the queries are compared to expected results contained in
the distribution. Depending on your choices during the configuration
(see section Running
Configure) some of the tests will fail. These are mentioned
imake makefiles will avoid the test which are expected
Here is the list of tests:
If some of the tests that are supposed to work fail, have a look at the `FIELD-EXAMPLE' directory. For each test there is a pair of files: `testname.is' and `testname.should'. `testname.should' is the query result on my machine, and `testname.is' was produced on your system. Compare the files and decide whether you can live with the differences; e.g. if only the weighting differs, you should be able to use the system.
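
The comparison of a `testname.is' file with its `testname.should' counterpart amounts to checking whether two files have the same contents. In practice a tool such as diff does this and also shows where the files differ. The following C sketch only illustrates the check itself; the function name is a hypothetical helper and not part of the distribution.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if both files have identical contents,
 * 0 if they differ, and -1 if either file cannot be opened. */
int files_identical(const char *path_is, const char *path_should)
{
    FILE *a = fopen(path_is, "rb");
    FILE *b = fopen(path_should, "rb");
    int ca, cb, result = 1;

    if (a == NULL || b == NULL) {
        if (a != NULL) fclose(a);
        if (b != NULL) fclose(b);
        return -1;
    }
    /* Read both files byte by byte until they differ or both end. */
    do {
        ca = getc(a);
        cb = getc(b);
        if (ca != cb) {
            result = 0;
            break;
        }
    } while (ca != EOF);
    fclose(a);
    fclose(b);
    return result;
}
```

A result of 0 only says that the files differ. Whether the difference matters (for instance, a changed weighting) still has to be judged by looking at the two files.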
If the tests were successful, you can install the system using `make install'. See section Compiling the system and running the tests.
This is how your installation directory should look after installation. See section Installation Prefix.
prefix
|-bin
| |-SFproxy
| |-catalog
| |-check-sources
| |-dictionary
| |-getaddrs
| |-inverted_file
| |-makedb
| |-mkfmt
| |-server_stats
| |-stats.awk
| |-stringtoany
| |-swais
| |-trunc
| |-wais-gif-display
| |-wais-html-display
| |-wais-jfif-display
| |-wais-jpeg-display
| |-wais-pict-display
| |-wais-ppm-display
| |-wais-tiff-display
| |-waisindex
| |-waisping
| |-waisq
| |-waisretrieve
| |-waissearch
| |-waisserver
| |-ws
| |-xwais
| |-xwaisq
|-lib
| |-X11
| | |-app-defaults
| | | |-Xwais
| |-XwaisHELP
| |-XwaisqHELP
| |-emacs
| | |-lisp
| | | |-wais.el
|-man
| |-man1
| | |-catalog.1
| | |-dictionary.1
| | |-inverted_file.1
| | |-makedb.1
| | |-mkfmt.1
| | |-waisindex.1
| | |-waisq.1
| | |-waissearch.1
| | |-waisserver.1
| | |-xwais.1
| | |-xwaisq.1
| |-man3
| | |-ftw.3
| | |-scandir.3
MANPATH. Note that the manuals are out of date!
To have the server started by
inetd, proceed as
follows. The description applies to SunOS 4.1.3 but should be similar on
other *NIX systems.
You do not have to install a server, if your clients are compiled and
linked with the
LOCAL_SEARCH (see section
on. Without a server only clients which have access to the database
files can do searches.
Add a line for the WAIS protocol to your `/etc/services':
You can use any name instead of `wais' if you want, but `z3950' is misleading (see section The WAIS Protocol). You can also choose a port different from 210. In that case, make sure that your database descriptions contain the right port (see section The Database Description).
To `/etc/inetd.conf' add a (one!) line:
wais stream tcp nowait wais \
  installdir/bin/waisserver waisd.d -d databasedir -e logfile
Send the inet daemon a `HUP' signal:
kill -HUP inetd-pid