java.lang.Object
  de.unidu.is.text.AbstractFilter
      de.unidu.is.text.WordSplitterFilter
This filter splits a string into tokens. First, every non-letter character is converted to a space. The resulting string is then split into tokens, with whitespace as the token boundary. Only tokens with at least 3 characters are returned.
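The steps described above can be sketched as a standalone Java method. This is a minimal illustration and not the class's actual source: the names `WordSplitSketch` and `split` are hypothetical, the use of `Character.isLetter` to decide what counts as a letter is an assumption, and the minimum token length is passed in as a parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch; it does not use any de.unidu.is.text classes.
class WordSplitSketch {

    // Replaces every non-letter character with a space, splits the result
    // on whitespace, and keeps tokens with at least minLength characters.
    static List<String> split(String text, int minLength) {
        StringBuilder buffer = new StringBuilder(text);
        for (int i = 0; i < buffer.length(); i++) {
            // Assumption: "letter" means Character.isLetter(...) is true.
            if (!Character.isLetter(buffer.charAt(i))) {
                buffer.setCharAt(i, ' ');
            }
        }
        List<String> tokens = new ArrayList<>();
        for (String token : buffer.toString().trim().split("\\s+")) {
            // Skip the empty token that split() yields for blank input.
            if (!token.isEmpty() && token.length() >= minLength) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [The, quick, brown, fox, hops]
        System.out.println(split("The quick-brown fox, in 2 hops!", 3));
    }
}
```

In this sketch "in" is dropped because it is shorter than 3 characters, and "2" disappears because it is replaced by a space before splitting.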
| Field Summary |
| Fields inherited from class de.unidu.is.text.AbstractFilter |
nextFilter |
| Constructor Summary | |
WordSplitterFilter(Filter nextFilter)
Creates a new instance and sets the next filter in the chain. |
|
WordSplitterFilter(Filter nextFilter, int length)
Creates a new instance and sets the next filter in the chain. |
|
| Method Summary | |
protected java.util.Iterator |
filter(java.lang.Object value)
Applies only this filter on the specified object, without considering the other filters from the filter chain. |
int |
getLength()
|
protected void |
handleBuffer(java.lang.StringBuffer buffer)
Handles the specified buffer before splitting it into tokens. |
boolean |
isAllowDigits()
Returns whether digits are allowed in the output. |
void |
setAllowDigits(boolean allowDigits)
Sets whether digits are allowed in the output. |
void |
setLength(int i)
|
| Methods inherited from class de.unidu.is.text.AbstractFilter |
apply, apply |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public WordSplitterFilter(Filter nextFilter)
nextFilter - next filter in the filter chain
public WordSplitterFilter(Filter nextFilter, int length)
nextFilter - next filter in the filter chain
| Method Detail |
protected java.util.Iterator filter(java.lang.Object value)
This method splits a string into tokens. First, every non-letter character is converted to a space. The resulting string is then split into tokens, with whitespace as the token boundary. Only tokens with at least 3 characters are returned.
filter in class AbstractFilter
value - value to be modified by this filter
AbstractFilter.filter(java.lang.Object)
public int getLength()
public void setLength(int i)
i -
public boolean isAllowDigits()
public void setAllowDigits(boolean allowDigits)
allowDigits - if true, digits are allowed in the output
protected void handleBuffer(java.lang.StringBuffer buffer)
The current implementation replaces every non-letter character with a space.
buffer - string buffer to be handled
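An in-place buffer replacement of this kind might look like the following sketch. The class name `BufferCleanSketch` and the method `clean` are hypothetical, and the `allowDigits` parameter reflects an assumption about how the digit setting could interact with this step; the documentation above does not say how the two are connected.

```java
// Standalone sketch; it does not use any de.unidu.is.text classes.
class BufferCleanSketch {

    // Overwrites in place every character that is not a letter with a space.
    // When allowDigits is true, digits are kept as well (an assumption).
    static void clean(StringBuffer buffer, boolean allowDigits) {
        for (int i = 0; i < buffer.length(); i++) {
            char c = buffer.charAt(i);
            boolean keep = Character.isLetter(c)
                    || (allowDigits && Character.isDigit(c));
            if (!keep) {
                buffer.setCharAt(i, ' ');
            }
        }
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer("abc-123");
        clean(buffer, true);
        // Prints [abc 123]
        System.out.println("[" + buffer + "]");
    }
}
```

Because the buffer is modified in place, its length never changes; each removed character leaves a single space behind, which the later whitespace split then treats as a token boundary.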