de.unidu.is.text
Class HTMLFilter
java.lang.Object
de.unidu.is.text.AbstractFilter
de.unidu.is.text.AbstractSingleItemFilter
de.unidu.is.text.HTMLFilter
- All Implemented Interfaces:
- Filter, SingleItemFilter
- public class HTMLFilter
- extends AbstractSingleItemFilter
This filter extracts all text from a specified HTML string, and returns
the text content in a single string.
- Since:
- 2003-07-04
- Version:
- $Revision: 1.9 $, $Date: 2005/03/09 08:59:15 $
- Author:
- Henrik Nottelmann
Constructor Summary
HTMLFilter(Filter nextFilter)
Creates a new instance and sets the next filter in the chain.
Method Summary
java.lang.Object
run(java.lang.Object value)
Extracts the text from HTML: removes all tags, replaces
well-known entities, removes all other entities, and returns a single
string.
protected boolean
substringStartsWith(java.lang.StringBuffer buffer,
int index,
java.lang.String str)
Tests whether the specified string buffer starts (from the
specified index) with the specified string (where all letters
are converted to lowercase).
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HTMLFilter
public HTMLFilter(Filter nextFilter)
- Creates a new instance and sets the next filter in the chain.
- Parameters:
nextFilter - next filter in the filter chain
run
public java.lang.Object run(java.lang.Object value)
- Extracts the text from HTML: removes all tags, replaces
well-known entities, removes all other entities, and returns a single
string.
- Parameters:
value - HTML string
- Returns:
- text content of the HTML string
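The behavior described for run can be illustrated with a minimal, self-contained sketch. This is not the source of HTMLFilter: the class name HtmlTextSketch, the method extractText, the entity table, and the 10-character limit on entity length are all assumptions made for illustration. The set of entities the real filter treats as "well-known" is not listed on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the behavior documented for HTMLFilter.run.
class HtmlTextSketch {

    // Assumed set of "well-known" entities; the real list is not documented here.
    private static final Map<String, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("amp", "&");
        ENTITIES.put("lt", "<");
        ENTITIES.put("gt", ">");
        ENTITIES.put("quot", "\"");
        ENTITIES.put("nbsp", " ");
    }

    static String extractText(String html) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            if (c == '<') {
                // Skip everything up to and including the closing '>'.
                int close = html.indexOf('>', i);
                if (close < 0) {
                    break; // unterminated tag: drop the remainder
                }
                i = close + 1;
            } else if (c == '&') {
                int semi = html.indexOf(';', i);
                if (semi > i && semi - i <= 10) {
                    // Known entity: emit its replacement. Unknown entity: emit nothing.
                    String replacement = ENTITIES.get(html.substring(i + 1, semi));
                    if (replacement != null) {
                        out.append(replacement);
                    }
                    i = semi + 1;
                } else {
                    // A bare ampersand that does not start an entity is kept as text.
                    out.append(c);
                    i++;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(extractText("<p>Fish &amp; chips &copy; 2005</p>"));
    }
}
```

In this sketch the tags disappear, &amp; becomes an ampersand, and &copy; is dropped because it is not in the assumed table. Whether the real filter inserts whitespace where tags were removed is not stated on this page, so the sketch does not.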
substringStartsWith
protected boolean substringStartsWith(java.lang.StringBuffer buffer,
int index,
java.lang.String str)
- Tests whether the specified string buffer starts (from the
specified index) with the specified string (where all letters
are converted to lowercase).
- Parameters:
buffer - string buffer to test
index - starting index
str - string to test
- Returns:
- true if the specified string buffer starts (from the
specified index) with the specified string
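One plausible reading of this contract is a case-insensitive prefix test at an offset, sketched below. The class name StartsWithSketch is hypothetical, and two points are assumptions rather than documented behavior: that the characters of the buffer are lowercased before comparison while str is expected to be lowercase already, and that an out-of-range index yields false instead of an exception.

```java
// Hypothetical sketch of a lowercase prefix test at a given offset.
class StartsWithSketch {

    static boolean substringStartsWith(StringBuffer buffer, int index, String str) {
        // Assumption: out-of-range positions simply do not match.
        if (index < 0 || index + str.length() > buffer.length()) {
            return false;
        }
        for (int k = 0; k < str.length(); k++) {
            // Assumption: only the buffer side is lowercased; str is given in lowercase.
            char fromBuffer = Character.toLowerCase(buffer.charAt(index + k));
            if (fromBuffer != str.charAt(k)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        StringBuffer html = new StringBuffer("<HTML><Body>");
        System.out.println(substringStartsWith(html, 6, "<body"));
    }
}
```

A check of this kind lets a tag scanner recognize element names such as BODY or Body without copying or lowercasing the whole buffer first.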