Abstract: Two kinds of data formats are commonly adopted in data mining: horizontal and vertical. Approaches based on vertical data formats have the advantages of requiring fewer database scans and computing item set supports quickly. One vertical data representation, the bit vector, has recently been widely used for mining frequent item sets and has yielded significant results. The size of the bit vector for an item set is, however, always the same: it equals the number of transactions in the database. In this paper, we propose a dynamic bit vector scheme to reduce the memory usage and computation time of mining frequent item sets from transaction databases. We present a fast method for computing the intersection of two dynamic bit vectors and an algorithm for mining frequent item sets based on this scheme. The proposed algorithm is compared with several other approaches, and experimental results show that it is efficient in both mining time and memory usage.
Keywords: Data mining; frequent item set; dynamic bit vector; vertical data format
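
To make the idea of a dynamic bit vector concrete, the following is a minimal sketch in Python. The abstract does not specify the storage layout, so the layout used here is an assumption: leading and trailing zero bytes are dropped, and the vector keeps only the index of its first stored byte together with the remaining bytes. The names DynamicBitVector, from_tids, and intersect are hypothetical and are not taken from the paper. Under this assumption, intersecting two vectors only requires a bytewise AND over the byte range where they overlap, and the support of an item set is the number of set bits in the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicBitVector:
    """Bit vector with leading/trailing zero bytes trimmed (assumed layout)."""
    start: int   # index of the first stored byte within the full-length vector
    data: bytes  # stored bytes; empty when no bit is set

    def support(self) -> int:
        # Support = number of transactions containing the item set = set bits.
        return sum(bin(b).count("1") for b in self.data)


def from_tids(tids):
    """Build a trimmed vector from 0-based transaction ids."""
    tids = list(tids)
    if not tids:
        return DynamicBitVector(0, b"")
    first, last = min(tids) // 8, max(tids) // 8
    buf = bytearray(last - first + 1)
    for t in tids:
        buf[t // 8 - first] |= 1 << (t % 8)
    return DynamicBitVector(first, bytes(buf))


def intersect(a, b):
    """AND two trimmed vectors over their overlapping byte range only."""
    lo = max(a.start, b.start)
    hi = min(a.start + len(a.data), b.start + len(b.data))
    if lo >= hi:  # no overlap, so the intersection is empty
        return DynamicBitVector(0, b"")
    out = bytes(a.data[i - a.start] & b.data[i - b.start] for i in range(lo, hi))
    # Re-trim zero bytes so the result is again a dynamic bit vector.
    left = 0
    while left < len(out) and out[left] == 0:
        left += 1
    if left == len(out):
        return DynamicBitVector(0, b"")
    right = len(out)
    while out[right - 1] == 0:
        right -= 1
    return DynamicBitVector(lo + left, out[left:right])


# Two items occurring in transactions {17, 40, 41} and {3, 40, 70}.
a = from_tids([17, 40, 41])
b = from_tids([3, 40, 70])
c = intersect(a, b)
```

In this example the vector for the first item stores 4 bytes and the second stores 9 bytes; the intersection touches only the 4 overlapping bytes and yields a one-byte vector with support 1 (transaction 40). A fixed-length bit vector would instead store one bit per transaction in the database for every item set, regardless of where its set bits fall.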