java-unidu
Common Java classes for IR from our group
This unofficial project hosts common Java classes developed and
used by our group. These classes mainly deal with IR
issues, and include the PIRE retrieval
system. The project is available as open source software under
the Apache License 2.0, which allows easy integration into
other projects.
We also have a public
mailing list for discussing everything related to "java-unidu"
and for announcing new releases. PIRE has its own public
mailing list.
The "java-unidu" project contains code for the following
application areas (only some of them are listed here; see the
source code and the JavaDocs for others):
- IR engine PIRE:
One of the major components of java-unidu is PIRE, an extensible, logic-based
probabilistic indexing and retrieval engine. PIRE can be
extended to perform simple XML retrieval by
implementing a generic interface for
IR engines (accepting XML documents and XIRQL queries), so
that different IR engines can be called with the same
code (besides PIRE, HyREX can also be used
this way).
- Property maps:
Property maps are an
extension to the map implementations
in the Java API. Values can be set and retrieved as
strings, ints, longs, doubles and booleans. Values can
also reference other values, and the property map can be
configured to store more than one value per key. The
package also contains classes for saving maps to and
loading them from streams and files.
- Text filters:
Text filters are used to modify
objects (in most cases, strings) in a uniform
way. Currently there are filters for parsing text, for
splitting text into tokens, for stemming and stopword
removal, for removing tags and for counting terms.
- General utility classes:
This part contains classes for
using different character encodings, for string
processing, for locating files, for managing collections,
and some other features which are not documented here.
- IR evaluation:
TBA
- GUI:
TBA
- Database support:
PIRE has support for connecting to databases (class
de.unidu.is.util.DB) and for formatting general
SQL statements.
- pDatalog++:
The pDatalog++ implementation is described under PIRE.
- General expressions:
The general expressions are described under PIRE.
- Gnuplot connection:
The class de.unidu.is.gnuplot.Gnuplot
makes it possible to use Gnuplot from Java for plotting and
for learning parameters. Plotting requires no knowledge of the
Gnuplot syntax.
- Parameter learning:
Parameters of functions can be learned via de.unidu.is.learning.LearnerFactory.
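To make the property-map idea from the list above more concrete, here is a minimal sketch in plain Java. It is not the java-unidu implementation: the class name PropertyMapSketch, all method names, and the ${key} syntax for referencing other values are assumptions made purely for illustration. Consult the JavaDocs for the actual classes and their behaviour.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a typed, multi-valued property map (not the java-unidu API). */
class PropertyMapSketch {
    // Every key maps to a list, so a key may hold more than one value.
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Replaces all values stored under the key with a single string. */
    void set(String key, String value) {
        List<String> list = new ArrayList<>();
        list.add(value);
        values.put(key, list);
    }

    /** Appends a further value to the key instead of replacing it. */
    void add(String key, String value) {
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    void setInt(String key, int value) { set(key, Integer.toString(value)); }
    void setBoolean(String key, boolean value) { set(key, Boolean.toString(value)); }

    /** Returns the first value of the key with references expanded, or null. */
    String get(String key) {
        List<String> list = values.get(key);
        return (list == null || list.isEmpty()) ? null : expand(list.get(0), 0);
    }

    /** Returns all values of the key with references expanded. */
    List<String> getAll(String key) {
        List<String> result = new ArrayList<>();
        List<String> stored = values.getOrDefault(key, Collections.<String>emptyList());
        for (String v : stored) {
            result.add(expand(v, 0));
        }
        return result;
    }

    int getInt(String key) { return Integer.parseInt(get(key)); }
    boolean getBoolean(String key) { return Boolean.parseBoolean(get(key)); }

    // Expands ${name} references (assumed syntax); the depth limit guards against cycles.
    private String expand(String value, int depth) {
        if (depth > 10) {
            throw new IllegalStateException("Reference nesting too deep (cycle?)");
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = value.indexOf("${", pos);
            int end = (start < 0) ? -1 : value.indexOf('}', start);
            if (start < 0 || end < 0) {
                out.append(value.substring(pos));
                return out.toString();
            }
            out.append(value, pos, start);
            List<String> target = values.get(value.substring(start + 2, end));
            if (target != null && !target.isEmpty()) {
                out.append(expand(target.get(0), depth + 1));
            }
            pos = end + 1;
        }
    }

    public static void main(String[] args) {
        PropertyMapSketch map = new PropertyMapSketch();
        map.set("base", "/data");
        map.set("index", "${base}/index"); // value referencing another value
        map.setInt("topK", 10);
        map.add("stopword", "the");
        map.add("stopword", "of");         // second value for the same key
        System.out.println(map.get("index"));
        System.out.println(map.getInt("topK"));
        System.out.println(map.getAll("stopword"));
    }
}
```

Storing everything as strings and converting on access keeps the sketch short; unknown references are silently dropped here, which is a simplification rather than a statement about how java-unidu behaves.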
Software
- java-unidu from 2005-03-20 (the latest java-unidu release)
- java-unidu from 2005-03-09 (the first java-unidu release)