Journal: Bulletin of the Technical Committee on Data Engineering
Publication year: 2014
Volume: 37
Issue: 2
Publisher: IEEE Computer Society
Abstract: Real-time big data processing aims for faster data retrieval and analysis. Lately, in order to accelerate real-time processing, big data platforms have been trying to exploit NAND flash-based storage devices, especially SSDs. NoSQL DBMSs, which have been used for real-time management of big data, depend significantly on index structures to manage data efficiently. Previous research on flash-aware index structures addressed the potential problems of hard-disk-oriented designs. In this paper, we focus on exploiting the potential benefits of flash SSDs. First, we examine the internal parallelism of flash SSDs by benchmarking several flash SSDs. Then we present a new I/O request concept, called psync I/O, that can exploit the internal parallelism of flash SSDs in a single process, and we propose a new search method (MPSearch) that enables tree-based indexes to exploit the internal parallelism of flash SSDs. Based on MPSearch, we present a B+-tree variant, PIO B-tree (Parallel I/O B-tree). PIO B-tree enhanced the B+-tree's insert performance by a factor of up to 16.3, while improving point-search performance by a factor of 1.2. The range search of PIO B-tree was up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree outperformed other flash-aware indexes in various synthetic workloads. In order to enhance NoSQL DBMS performance on flash SSDs, PIO B-tree can be adopted, or MPSearch can be applied to other tree-based index structures adopted in NoSQL DBMSs.