XML Clustering and Browsing - Notes
Notes
Semantics of XML markup, slide 3
- Order of elements:
This is less about structure (the order of element tags) and
more about the order of elements of the same type.
E.g. the frames in an MPEG video, where the order is
relevant.
Similarity: nesting, slide 5
- Consider the structure only, i.e. only the nesting as
such. Do not look at the element names here.
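A minimal sketch of a name-free comparison, assuming that "nesting as such" means the tree shape only. The function names are hypothetical, and ignoring sibling order (via sorting) is an additional assumption, not something the slides state.

```python
import xml.etree.ElementTree as ET

def nesting_shape(element):
    """Return the nesting below `element` as nested tuples, ignoring tag names."""
    # Assumption: sibling order is irrelevant for nesting, so child shapes are sorted.
    return tuple(sorted(nesting_shape(child) for child in element))

def same_nesting(xml_a, xml_b):
    """True if both documents have the same tree shape, whatever their tags are."""
    return nesting_shape(ET.fromstring(xml_a)) == nesting_shape(ET.fromstring(xml_b))
```

Under this sketch, `<a><b><c/></b><d/></a>` and `<x><y/><z><w/></z></x>` count as equally nested, although they share no element name.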
Similarity: coordination, slide 6
- Treat coordinated elements more or less like
tuples.
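One possible reading of the tuple view, sketched under the assumption that "coordinated elements" are the sibling elements below a common parent. The helper names and the position-wise overlap measure are illustrative choices, not taken from the slides.

```python
import xml.etree.ElementTree as ET

def as_tuple(parent):
    """Read the children of `parent` as one tuple of (tag, text) fields."""
    return tuple((child.tag, (child.text or "").strip()) for child in parent)

def field_overlap(tuple_a, tuple_b):
    """Fraction of fields that agree, compared position by position (assumed measure)."""
    if not tuple_a and not tuple_b:
        return 1.0
    matches = sum(1 for a, b in zip(tuple_a, tuple_b) if a == b)
    return matches / max(len(tuple_a), len(tuple_b))
```

For example, two records that agree on one of their two fields get an overlap of 0.5.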
Similarity: importance, slide 9
- There is no theoretical foundation for the weighted
sum.
- By modeling importance from the user's point of view, we
would arrive at a product (rather than a sum), with the
weights as exponents of the factors.
- Norbert has previously experimented with
learning such weights via regression techniques.
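The two ways of combining per-aspect similarities can be contrasted in a short sketch. The function names, the scores and the weights are illustrative assumptions; the slides do not give concrete values.

```python
import math

def weighted_sum(scores, weights):
    """Combine scores additively: sum of w_i * s_i."""
    return sum(w * s for s, w in zip(scores, weights))

def weighted_product(scores, weights):
    """Combine scores multiplicatively: product of s_i ** w_i."""
    return math.prod(s ** w for s, w in zip(scores, weights))
```

Taking logarithms turns the product into a weighted sum of log-scores, log(prod s_i^w_i) = sum w_i * log(s_i), so for strictly positive scores the exponents could in principle be fitted with linear regression. Unlike the sum, the product becomes zero as soon as one aspect with a positive weight scores zero.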
General considerations, slide 12
- A clear partition is not always the most
meaningful one. The goal should therefore not be a clear
partition as such, but rather finding partitions with semantic
interpretations.
Clustering and perspectives, slide 17
- Order:
E.g. in the list of ingredients of a food product, the order of
elements is relevant and should therefore be considered for
clustering, too.
- Position:
For papers from certain fields of research, the order
of the authors is relevant. A user might, for example, want to
cluster by the first author only.
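The position perspective could be sketched as follows, assuming a hypothetical document layout with `paper`, `title` and `author` elements. These element names and the grouping function are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def first_author(paper):
    """Return the text of the first <author> child in document order, or None."""
    author = paper.find("author")  # find() returns the first matching child
    return author.text.strip() if author is not None and author.text else None

def cluster_by_first_author(papers_xml):
    """Group paper titles by first author; later author positions are ignored."""
    root = ET.fromstring(papers_xml)
    clusters = defaultdict(list)
    for paper in root.iter("paper"):
        title = paper.findtext("title", default="").strip()
        clusters[first_author(paper)].append(title)
    return dict(clusters)
```

Two papers with the same set of authors but a different first author end up in different clusters here, which is exactly the distinction an order-blind comparison would lose.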
Others
- We did not consider links at all.