Authors: Jinhee Park, Jaekwang Kim, and Jee-Hyong Lee
Journal: Journal of Information Science
Vol.  
No.  
pp.  
Published: 2013-10-24

Abstract

This paper proposes a method for extracting topic keywords from blogs based on the richness of their content. If a blog includes rich content related to a topic word, that word can be considered a keyword of the blog. To this end, we propose a new measure, richness, which indicates how well a blog covers the trendy subtopics of a keyword. To obtain the trendy subtopics of keywords, we use the web as an external source of topical context. Because the web contains varied and up-to-date information, it can be used to find popular and trendy content related to a topic. For each candidate keyword, a set of web documents is retrieved through Google, and the subtopics found in those documents are modelled with a probabilistic approach. Based on the subtopic models, the proposed method evaluates the richness of a blog for each candidate keyword, that is, how well the blog covers the keyword's trendy subtopics. If a blog includes varied content on a word, that word should be chosen as one of the blog's keywords. In experiments, the proposed method is compared with various other methods and shows better results in terms of hit count, trendiness and consistency.
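
The abstract does not give the formula for richness, so the following Python sketch is only an illustration of the general idea, not the authors' definition. It assumes each candidate keyword comes with a list of subtopics, each a hypothetical pair of a popularity weight and a term-probability table (as might be estimated from retrieved web documents), and it scores a blog by the weighted share of each subtopic's probability mass whose terms appear in the blog. The names `richness` and `rank_keywords` and the scoring rule itself are assumptions made for this example.

```python
def richness(blog_tokens, subtopics):
    """Illustrative coverage score in [0, 1] (assumed formula, not from the paper).

    blog_tokens: list of tokens from the blog.
    subtopics:   list of (weight, {term: probability}) pairs for one keyword;
                 weights are assumed to sum to 1.
    """
    blog_terms = set(blog_tokens)
    score = 0.0
    for weight, term_probs in subtopics:
        total = sum(term_probs.values())
        if total == 0:
            continue  # an empty subtopic contributes nothing
        # Share of this subtopic's probability mass whose terms occur in the blog.
        covered = sum(p for term, p in term_probs.items() if term in blog_terms)
        score += weight * (covered / total)
    return score


def rank_keywords(blog_tokens, candidates):
    """Sort candidate keywords by descending richness.

    candidates: {keyword: subtopics}, with subtopics as described in richness().
    """
    scored = [(kw, richness(blog_tokens, subs)) for kw, subs in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data, invented purely for illustration.
candidates = {
    "camera": [(0.5, {"lens": 0.5, "sensor": 0.5}), (0.5, {"tripod": 1.0})],
    "travel": [(1.0, {"flight": 0.5, "hotel": 0.5})],
}
blog = ["lens", "tripod", "review", "hotel"]
ranking = rank_keywords(blog, candidates)
```

In this toy example the blog mentions terms from both "camera" subtopics but from only half of the single "travel" subtopic, so "camera" ranks first. A real system would need tokenisation, a subtopic model fitted to the retrieved web documents, and whatever coverage definition the paper actually uses.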
