Abstract: Pharmaceutical drug candidate databases have grown to massive sizes in recent years, driven by improvements in the benchside high-throughput screening tools used by scientists. This rapid growth has shifted the bottleneck in discovery and product development from the benchside to the computational side, creating a need for new computational tools that facilitate access to and interpretation of such massive data. This paper introduces a window-based compression technique that supports random database access. The technique improves random access to records in the database while maintaining high sequential throughput. Its impact is evaluated in the context of both a non-indexed and an indexed database, and its performance gain is demonstrated on a drug candidate database used in the pharmaceutical drug discovery process.
Keywords: drug candidate database; compression; random access