Sep 06 2010
Publications
Conferences
Social Networks as Data Source for Recommendation Systems
Reviews and review-based rankings are widely used in recommendation systems to provide potential customers with quality information about selected products. In recent years, many researchers have shown that these reviews are neither objective nor representative of real quality values. Even established ranking methods designed to fix this problem have been shown to be unreliable. In this work, user generated content from fora, weblogs and similar trustworthy social networks is proposed as an alternative data source. It is shown how this data can be used to calculate a satisfaction and relevance measure for different product features in order to provide potential customers with reliable quality information. The method is evaluated in the automotive domain against J.D. Power’s established Initial Quality Study to ensure that it provides meaningful quality-related data.
M. Bank and J. Franke, "Social Networks as Data Source for Recommendation Systems," in E-Commerce and Web Technologies, 2010, pp. 49-60.
@inproceedings{Bank2010_2,
author = {Bank, Mathias and Franke, Juergen},
affiliation = {Daimler AG, Research \& Development, D-89081 Ulm, Germany},
title = {Social Networks as Data Source for Recommendation Systems},
booktitle = {E-Commerce and Web Technologies},
series = {Lecture Notes in Business Information Processing},
editor = {van der Aalst, Wil and Mylopoulos, John and Sadeh, Norman M. and Shaw, Michael J. and Szyperski, Clemens and Buccafurri, Francesco and Semeraro, Giovanni},
publisher = {Springer Berlin Heidelberg},
isbn = {978-3-642-15208-5},
keyword = {Economics/Management Science},
pages = {49--60},
volume = {61},
url = {http://dx.doi.org/10.1007/978-3-642-15208-5_5},
doi = {10.1007/978-3-642-15208-5_5},
year = {2010}
}
Fuzzification of agglomerative hierarchical crisp clustering algorithms
User generated content from fora, weblogs and other social networks is a very fast growing data source, to which information extraction algorithms can provide convenient access. Hierarchical clustering algorithms are used to identify the topics covered in this data at different levels of abstraction. In recent years, there has been some research using hierarchical fuzzy algorithms to handle comments that deal not with one topic but with many different topics at once. The variants of the well-known fuzzy c-means algorithm used there are nondeterministic, and thus the clustering results are not reproducible. In this work, we present a deterministic algorithm that fuzzifies currently available agglomerative hierarchical crisp clustering algorithms and therefore allows arbitrary multi-assignments. It is shown how to reuse well-studied linkage metrics, and the monotonic behavior is analyzed for each of them. The proposed algorithm is evaluated using collections from the RCV1 and RCV2 corpora.
[paper] [slides]
M. Bank and F. Schwenker, "Fuzzification of agglomerative hierarchical crisp clustering algorithms (to be published)," in Proceedings of the 34th Annual Conference of the German Classification Society, 2010.
@inproceedings{Bank2010_1,
author={Mathias Bank and Friedhelm Schwenker},
title={Fuzzification of agglomerative hierarchical crisp clustering algorithms (to be published)},
booktitle={Proceedings of the 34th Annual Conference of the German Classification Society},
year={2010}
}
Automatic User Comment Detection in Flat Internet Fora
Millions of people use the World Wide Web and publish content online. This user generated content contains much information that is relevant not only to marketing but also to companies in general (customer-oriented products), governments (direct democracy) and many more. Analysis of such data is becoming more and more important. This paper deals with a prerequisite: we propose an algorithm that automatically detects posting structures in flat internet fora in order to extract user comments. The algorithm is able to handle a wide range of different fora systems, including nested structures. The approach first detects the main content section by applying a modified version of the SST algorithm and then detects the posting structure by using several posting properties found in internet fora. It creates XPath expressions for faster data extraction in further steps.
M. Bank and M. Mattes, "Automatic User Comment Detection in Flat Internet Fora," in Database and Expert Systems Applications, International Workshop on, 2009, pp. 373-377.
@inproceedings{Bank2009,
author = {Mathias Bank and Michael Mattes},
title = {Automatic User Comment Detection in Flat Internet Fora},
booktitle ={Database and Expert Systems Applications, International Workshop on},
volume = {0},
issn = {1529-4188},
year = {2009},
pages = {373--377},
doi = {10.1109/DEXA.2009.14},
publisher = {IEEE Computer Society},
}
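The abstract above ends with the generation of XPath expressions that are reused for later extraction. The following is only a minimal sketch of that final extraction step, not the paper's detection algorithm: the page markup, the class names and the expression in `POST_XPATH` are hypothetical, and the expression is simply assumed to be the output of an earlier detection step. It uses the limited XPath subset of Python's standard `xml.etree.ElementTree`, so it needs well-formed markup; real forum pages would need a tolerant HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed forum page used only for illustration.
PAGE = """
<html><body>
  <div id="nav">Home</div>
  <div id="content">
    <div class="post"><span class="author">anna</span><p>First comment</p></div>
    <div class="post"><span class="author">ben</span><p>Second comment</p></div>
  </div>
</body></html>
"""

# Assumed result of a structure-detection step: one expression per forum layout.
POST_XPATH = ".//div[@id='content']/div[@class='post']"


def extract_comments(page, post_xpath):
    """Apply a stored XPath expression and return (author, text) pairs."""
    root = ET.fromstring(page.strip())
    comments = []
    for post in root.findall(post_xpath):
        # Paths below are relative to each matched posting element.
        author = post.findtext("span[@class='author']")
        text = post.findtext("p")
        comments.append((author, text))
    return comments


comments = extract_comments(PAGE, POST_XPATH)
print(comments)
```

Once such an expression is stored for a forum, further pages with the same layout can be processed by calling `extract_comments` again without repeating the structure detection, which is the speed-up the abstract alludes to.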
Workshops
Using the Internet as Sensor for Customer Perception
In a highly competitive market such as the automotive sector, it is no longer sufficient to define quality solely in terms of technical aspects. Rather, quality must be analyzed holistically from the perspective of the customer. The customer's perception of a product determines buying behavior. To this end, Kano et al. have created a special survey form to distinguish between different quality categories for product features. It allows a cost-optimized product design that maximizes customer satisfaction. Practical use has shown that conducting a Kano survey is a very critical process. In this work, an alternative data source is proposed that systematically bypasses the problems of a Kano survey: the Internet. Due to the increasing number of fora, weblogs and other social networks, a large amount of unsolicited and mostly unbiased customer feedback is available. This feedback is used in a prototype architecture to obtain customer satisfaction for different product features and the corresponding influence on customer perception. Different abstraction levels provide information to improve this perception over time.
[paper] [no slides]
Other Presentations
The Internet as a Quality Sensor (Internet als Qualitätssensor): advanced seminar (Oberseminar) at the University of Mainz (2009)
In Germany, 35% of the 24.7 million Internet users are willing to publish their own content online. 78% of Internet users trust this data when buying products. However, a number of analysis methods are needed before this content can be used in a structured way as a quality indicator. In this seminar, Mathias Bank, a doctoral candidate at Daimler Research in Ulm, presents the essential processing steps, with particular attention to existing sentiment techniques and possible topic detection methods.
[Slides available on request]
