Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year: 2011
Volume: 11
Issue: 6
Pages: 62-68
Publisher: International Journal of Computer Science and Network Security
Abstract: Web usage mining has been widely adopted in fields such as site-structure optimization, user-behavior analysis, personalized web services, and system performance tuning. Although much research has been done on web log mining algorithms and log pre-processing techniques, efficient retrieval of structured content for web log mining is seldom reported. In this paper, we first show that people are much more interested in discovering user navigation based on various path sources. Then, we present a novel session identification algorithm, Referrer Link, which discovers linked referrers to serve source-oriented path mining. Next, we introduce an efficient web log indexing and path extraction technique that provides structured web log data for general-purpose log mining. The experimental results show that mining conducted on the sessions discovered by the proposed Referrer Link algorithm is on average 10% more accurate than mining conducted on sessions from the time-out approach.
Keywords: Session identification; web log mining; path extraction; log indexing
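
Illustrative sketch (not part of the original record): the abstract contrasts time-out session identification with a referrer-based approach but does not give either algorithm. The Python below is a minimal sketch of the general idea only. The time-out function is the common baseline; the referrer function is a generic referrer-chaining heuristic and is not claimed to be the paper's Referrer Link algorithm. The `Hit` record, its field names, and the 1800-second gap are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    user: str      # assumed pre-resolved visitor key, e.g. IP plus user agent
    time: float    # request time in seconds
    url: str       # requested page
    referrer: str  # referring URL, "" when the log has none


def timeout_sessions(hits, gap=1800):
    """Baseline: start a new session when a user's consecutive hits
    are more than `gap` seconds apart (gap value is an assumption)."""
    sessions = []
    last = {}  # user -> (current session, time of last hit)
    for h in sorted(hits, key=lambda x: x.time):
        cur = last.get(h.user)
        if cur is None or h.time - cur[1] > gap:
            session = [h]
            sessions.append(session)
        else:
            session = cur[0]
            session.append(h)
        last[h.user] = (session, h.time)
    return sessions


def referrer_sessions(hits):
    """Generic referrer chaining (hypothetical heuristic): attach a hit to the
    user's most recent session that already contains its referrer URL;
    otherwise open a new session."""
    sessions = []
    by_user = {}  # user -> that user's sessions, oldest first
    for h in sorted(hits, key=lambda x: x.time):
        target = None
        for s in reversed(by_user.setdefault(h.user, [])):
            if h.referrer and any(p.url == h.referrer for p in s):
                target = s
                break
        if target is None:
            target = []
            sessions.append(target)
            by_user[h.user].append(target)
        target.append(h)
    return sessions
```

Under these assumptions, a visitor who opens an unrelated page in a second tab is split off by `referrer_sessions` but merged into the same session by `timeout_sessions`, while a long pause splits only the time-out result. That difference is the kind of source-oriented path separation the abstract refers to.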