de.unidu.is.text
Class ParserFilter

java.lang.Object
  extended by de.unidu.is.text.AbstractFilter
      extended by de.unidu.is.text.ParserFilter
All Implemented Interfaces:
Filter

public class ParserFilter
extends AbstractFilter

This filter splits a string into tokens (all non-letter characters are converted into whitespace, the resulting string is split into tokens with whitespace as token boundaries, and only tokens with at least 3 characters are considered), converts the tokens into lowercase, computes the stems of the tokens, and removes stopwords.
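
The processing steps described above can be sketched as follows. This is a minimal illustration and not the actual implementation of ParserFilter: the class name ParserFilterSketch, the method tokenize, the small stopword set, and the identity stemmer are all assumptions made for the example, since the stemming algorithm and the stopword list used by this class are not documented here. Stopwords are removed after stemming, following the order in which the description lists the steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the pipeline described above; not the real ParserFilter.
final class ParserFilterSketch {

    // Assumed stopword list for illustration only; the real list is not documented.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("the", "and", "for", "are", "with"));

    // Placeholder stemmer (identity); a real stemming algorithm would be plugged in here.
    static final UnaryOperator<String> STEMMER = UnaryOperator.identity();

    static List<String> tokenize(String text) {
        // Step 1: convert every non-letter character into a space.
        StringBuilder cleaned = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            cleaned.append(Character.isLetter(c) ? c : ' ');
        }

        List<String> tokens = new ArrayList<>();
        // Step 2: split on whitespace.
        for (String raw : cleaned.toString().trim().split("\\s+")) {
            // Step 3: keep only tokens with at least 3 characters.
            if (raw.length() < 3) {
                continue;
            }
            // Steps 4 and 5: lowercase, then stem.
            String stem = STEMMER.apply(raw.toLowerCase(Locale.ROOT));
            // Step 6: drop stopwords.
            if (!STOPWORDS.contains(stem)) {
                tokens.add(stem);
            }
        }
        return tokens;
    }
}
```

Under these assumptions, ParserFilterSketch.tokenize("The quick-brown fox, at 42 km/h, is running with IT!") yields the tokens quick, brown, fox, running: "The" and "with" are dropped as stopwords, and "at", "km", "h", "is" and "IT" are shorter than 3 characters.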

Since:
2003-07-04
Version:
$Revision: 1.6 $, $Date: 2005/02/21 17:29:28 $
Author:
Henrik Nottelmann

Field Summary
 
Fields inherited from class de.unidu.is.text.AbstractFilter
nextFilter
 
Constructor Summary
ParserFilter()
          Creates a new instance.
 
Method Summary
protected  java.util.Iterator filter(java.lang.Object value)
          Parses the given string into tokens.
 
Methods inherited from class de.unidu.is.text.AbstractFilter
apply, apply
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ParserFilter

public ParserFilter()
Creates a new instance.

Method Detail

filter

protected java.util.Iterator filter(java.lang.Object value)
Parses the given string into tokens.

Specified by:
filter in class AbstractFilter
Parameters:
value - string to be parsed
Returns:
iterator over tokens
See Also:
AbstractFilter.filter(java.lang.Object)