Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2022
Volume: 13
Issue: 3
DOI: 10.14569/IJACSA.2022.0130354
Language: English
Publisher: Science and Information Society (SAI)
Abstract: Finding a data storage method that can keep pace with data processing speed across the network is one of the key problems in big data. As computing speed and cluster size increase, the I/O and network processes associated with data-intensive workloads cannot keep up with that growth, and data processing applications experience latency from long I/O waits. Distributed data storage systems can use Web scale technology to assist centralized data storage in a computing environment to meet the needs of data science. This study analyzes several distributed data storage models, namely NFS, GlusterFS and MooseFS, and proposes a distributed data storage method. The parameters measured are transfer rate, IOPS and CPU resource usage. Tests of sequential and random reads and writes show that GlusterFS is the fastest overall and achieves the best sequential and random read performance with 64k storage blocks. MooseFS obtains its best random read performance with 64k storage blocks. With 32k storage blocks, NFS achieves the best results in random writes. The performance of a distributed data storage system may therefore depend on the storage block size: a larger block can yield faster data transfer and faster operations on data.
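
Two of the metrics named in the abstract, transfer rate and IOPS, are tied together by the block size: for a fixed block size, transfer rate is roughly IOPS multiplied by the block size. The sketch below shows only that arithmetic. It is not the study's measurement tooling; the function name `transfer_rate_mib_s` and the sample IOPS figures are hypothetical and are not results from the paper.

```python
def transfer_rate_mib_s(iops: float, block_size_kib: int) -> float:
    """Approximate transfer rate in MiB/s from IOPS and block size in KiB.

    Assumes every I/O operation moves exactly one full block.
    """
    if iops < 0 or block_size_kib <= 0:
        raise ValueError("iops must be >= 0 and block_size_kib must be > 0")
    return iops * block_size_kib / 1024  # KiB/s -> MiB/s


if __name__ == "__main__":
    # Hypothetical IOPS figures, chosen only to illustrate the arithmetic.
    for block_kib, iops in [(32, 2000), (64, 1500)]:
        rate = transfer_rate_mib_s(iops, block_kib)
        print(f"{block_kib}k blocks at {iops} IOPS -> {rate:.2f} MiB/s")
```

In this made-up case the 64k run completes fewer operations per second yet moves more data per second, which is one way a larger block size can show up as a higher transfer rate.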