Short text classification in twitter to improve information filtering, B. Sriram, D. Fuhry, E. Demir, and H. Ferhatosmanoglu, 2010
This paper describes research that classifies tweets using a reduced set of features. The authors assign each tweet to one of five classes: News, Events, Opinions, Deals, and Private Messages. The problem they address is the curse of dimensionality that arises when trying to overcome the sparseness of Twitter messages. Other research typically relies on external knowledge bases to support tweet classification, which the authors argue can be slow because the knowledge base has to be queried so often.
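To make the idea of a reduced feature set concrete, here is a minimal sketch in Python. The feature names and word lists are my own illustrative assumptions, not the features reported in the paper. The point is only that each tweet maps to a short, fixed-length set of signals instead of a sparse bag of words, and that nothing is looked up in an external knowledge base.

```python
import re

# Hypothetical cue lists, chosen for illustration only (not from the paper).
OPINION_WORDS = {"love", "hate", "awesome", "terrible", "best", "worst"}
SLANG_WORDS = {"lol", "omg", "btw", "imo", "thx"}


def extract_features(tweet: str) -> dict:
    """Map a tweet to a handful of boolean features computed from its text alone."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return {
        # A leading @username often marks a message aimed at one person.
        "starts_with_mention": tweet.lstrip().startswith("@"),
        # An @username preceded by other text, i.e. not at the very start.
        "has_inner_mention": re.search(r"\S\s+@\w+", tweet) is not None,
        # Currency or percent signs may hint at a deal.
        "has_currency_or_percent": re.search(r"[$%]", tweet) is not None,
        "has_opinion_word": any(t in OPINION_WORDS for t in tokens),
        "has_slang": any(t in SLANG_WORDS for t in tokens),
    }


if __name__ == "__main__":
    for text in ["@bob lunch tomorrow? lol", "50% off shoes today, only $20"]:
        print(text, extract_features(text))
```

Whatever the tweet says, the result has the same five entries, so a standard classifier trained on these vectors works in a very small feature space.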
Important points about this paper:
1. They provide a very useful discussion of Twitter and tweets
2. How they classify tweets is interesting and worth another review