May 28, 2012

Fuzzification of agglomerative hierarchical crisp clustering algorithms

Published by

User-generated content from forums, weblogs, and other social networks is a rapidly growing data source, and information extraction algorithms can make it conveniently accessible. Hierarchical clustering algorithms are used to present the topics covered in this data at different levels of abstraction. In recent years, some research has used hierarchical fuzzy algorithms to handle comments that deal with several topics at once rather than a single one. The variants of the well-known fuzzy c-means algorithm used there are nondeterministic, so their clustering results are not reproducible. In this work, we present a deterministic algorithm that fuzzifies currently available agglomerative hierarchical crisp clustering algorithms and therefore allows arbitrary multiple assignments. We show how to reuse well-studied linkage metrics and analyze the monotonic behavior of each of them. The proposed algorithm is evaluated on collections from the RCV1 and RCV2 corpora.
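For readers unfamiliar with the terms, the following is a minimal sketch of the crisp starting point only: a deterministic agglomerative clustering with three standard linkage metrics (single, complete, average) and a check that the merge distances never decrease. It is not the fuzzy algorithm from the paper. The function names, the tie-breaking rule, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def linkage_distance(a, b, dist, method="single"):
    """Distance between clusters a and b, given as tuples of point indices."""
    pairwise = [dist[i][j] for i in a for j in b]
    if method == "single":
        return min(pairwise)
    if method == "complete":
        return max(pairwise)
    if method == "average":
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(dist, method="single"):
    """Crisp agglomerative clustering on a symmetric distance matrix.

    Returns the merges as (cluster_a, cluster_b, merge_distance) in the
    order they happen. Ties in distance are broken by comparing the
    index tuples (an assumption of this sketch), so the result is
    reproducible.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair; min over (distance, a, b) is deterministic.
        d, a, b = min(
            (linkage_distance(a, b, dist, method), a, b)
            for a, b in combinations(clusters, 2)
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


def is_monotonic(merges):
    """True if merge distances never decrease from one merge to the next."""
    heights = [d for _, _, d in merges]
    return all(x <= y for x, y in zip(heights, heights[1:]))


# Toy data: four points on a line, with absolute difference as distance.
points = [0, 1, 3, 7]
dist = [[abs(p - q) for q in points] for p in points]

for method in ("single", "complete", "average"):
    merges = agglomerate(dist, method)
    print(method, [d for _, _, d in merges], is_monotonic(merges))
```

Because each object belongs to exactly one cluster at every level, this baseline cannot represent a comment that covers several topics at once, which is the limitation the abstract addresses.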

[paper] [slides]

  • [2010,inproceedings] bibtex
    M. Bank and F. Schwenker, "Fuzzification of agglomerative hierarchical crisp clustering algorithms (to be published)," in Proceedings of the 34th Annual Conference of the German Classification Society, 2010.
    @inproceedings{Bank2010_1,
      author={Mathias Bank and Friedhelm Schwenker},
      title={Fuzzification of agglomerative hierarchical crisp clustering algorithms (to be published)},
      booktitle={Proceedings of the 34th Annual Conference of the German Classification Society},
      year={2010}
    }
