Published: February 28, 2018
Integrated Framework for Keyword-based Text Data Collection and Analysis
Minki Cha, Jung-Hyok Kwon, Sol-Bee Lee, Jaehoon Park, Sungkwan Youm, and Eui-Jik Kim
(Received August 1, 2017; Accepted November 2, 2017)
Keywords: data analysis, integrated framework, intelligent service, text data collection, web crawling
In this paper, we present an integrated framework for keyword-based text data collection and analysis. The integrated framework consists of four components: (1) a user interface, (2) a web crawler, (3) a data analyzer, and (4) a database (DB). The user interface lets the user set the input keywords and option values for web crawling and text data analysis through a graphical user interface (GUI). In addition, it presents the analysis results through data visualization. The web crawler collects the text data of articles posted on the web based on the input keywords. The data analyzer classifies the articles into “relevant articles” and “nonrelevant articles” using predefined knowledge (i.e., a set of words to be included in the articles). Then, it analyzes the text data of the relevant articles and visualizes the results of the data analysis. Finally, the DB stores the collected text data, the predefined knowledge defined by the user, and the results of the data analysis and data visualization. We verify the feasibility of the integrated framework through proof-of-concept (PoC) prototyping. The experimental results show that the implemented prototype collects and analyzes the text data of articles reliably.
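The classification step of the data analyzer can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Article` type, the `tokenize` and `classify` functions, and the `min_hits` threshold are hypothetical names introduced here, and the sketch assumes that an article counts as relevant when it contains every word of the predefined knowledge unless a lower threshold is given.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical record for one collected article.
    title: str
    body: str


def tokenize(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify(articles, knowledge, min_hits=None):
    """Split articles into (relevant, nonrelevant) lists.

    `knowledge` is the user-defined set of words to be included in
    relevant articles. Assumption: by default every word must appear;
    `min_hits` relaxes this to "at least that many words".
    """
    required = {word.lower() for word in knowledge}
    if min_hits is None:
        min_hits = len(required)
    relevant, nonrelevant = [], []
    for article in articles:
        tokens = tokenize(article.title + " " + article.body)
        hits = len(required & tokens)
        if hits >= min_hits:
            relevant.append(article)
        else:
            nonrelevant.append(article)
    return relevant, nonrelevant


if __name__ == "__main__":
    collected = [
        Article("Sensor networks", "Wireless sensor nodes collect data."),
        Article("Cooking", "A recipe for soup."),
    ]
    relevant, nonrelevant = classify(collected, {"sensor", "data"})
    print([a.title for a in relevant])
    print([a.title for a in nonrelevant])
```

Exact word matching on tokens is the simplest reading of "a set of words to be included in the articles"; a real analyzer might instead use stemming or language-specific morphological analysis.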
Corresponding author: Eui-Jik Kim
Cite this article
Minki Cha, Jung-Hyok Kwon, Sol-Bee Lee, Jaehoon Park, Sungkwan Youm, and Eui-Jik Kim, Integrated Framework for Keyword-based Text Data Collection and Analysis, Sens. Mater., Vol. 30, No. 3, 2018, p. 439-445.