Topic Crawler for Social Networks Monitoring

Authors
Publication date 2013
Host editors
  • P. Klinov
  • D. Mouromtsev
Book title Knowledge Engineering and the Semantic Web
Book subtitle 4th International Conference, KESW 2013, St. Petersburg, Russia, October 7-9, 2013 : proceedings
ISBN
  • 9783642413599
ISBN (electronic)
  • 9783642413605
Series Communications in Computer and Information Science
Event 4th International Conference on Knowledge Engineering and Semantic Web, KESW 2013
Pages (from-to) 214-227
Number of pages 14
Publisher Heidelberg: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

This paper describes a focused crawler for monitoring social networks, used for information extraction and content analysis. The crawler implements the MapReduce model for distributed computation and is oriented toward large volumes of text data. As a focused crawler, it searches for pages classified as relevant to a specified topic. The classifier is built using a knowledge database that defines words, their classes, and the rules for joining words into phrases. From the weights of words and phrases, a text weight is computed that indicates relevance to the topic. The system was used to detect the drug community in the Russian segment of the LiveJournal social network. Official and slang drug terminology was used to develop the knowledge database. Different aspects of the knowledge database and the classifier are studied. A non-homogeneous Poisson process was used to model blog changes, since it permits building a monitoring policy that accounts for blog update frequency and time-of-day effects. Evaluation on real data shows a 25% increase in the detection of new posts.
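
As a rough illustration of the non-homogeneous Poisson idea mentioned in the abstract, the sketch below computes the expected number of new posts, and the probability of at least one new post, over a time window. It is not the paper's implementation: the piecewise-constant hourly rate table and the function names `expected_new_posts` and `prob_new_post` are assumptions made only for this example.

```python
import math


def expected_new_posts(hourly_rates, start_hour, end_hour):
    """Integrate a piecewise-constant posting rate over [start_hour, end_hour).

    hourly_rates is an assumed list of 24 values (posts per hour), one per
    hour of the day; times are in hours and may run past midnight.
    """
    total = 0.0
    t = start_hour
    while t < end_hour:
        # The rate is constant up to the next full hour (or the window end).
        segment_end = min(math.floor(t) + 1, end_hour)
        rate = hourly_rates[int(math.floor(t)) % 24]
        total += rate * (segment_end - t)
        t = segment_end
    return total


def prob_new_post(hourly_rates, start_hour, end_hour):
    """Probability of at least one new post in the window.

    For a Poisson process with integrated rate L over the window,
    P(N >= 1) = 1 - exp(-L).
    """
    return 1.0 - math.exp(-expected_new_posts(hourly_rates, start_hour, end_hour))
```

A monitoring policy could, for instance, revisit first the blogs with the highest `prob_new_post` for the upcoming window, which reflects both a blog's overall update frequency and the time of day.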

Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-642-41360-5_17
Other links https://www.scopus.com/pages/publications/84884640207