Basic Article Information

  • Title: BioFactHMM: MULTIDIMENSIONAL MODELING OF BIOLOGICAL DATA FROM HIDDEN MARKOV MODEL GENERATED DATASETS
  • Authors: Manas Ranjan Pradhan; Beenu Mago; Deepak Kalra
  • Journal: Indian Journal of Computer Science and Engineering
  • Print ISSN: 2231-3850
  • Electronic ISSN: 0976-5166
  • Publication year: 2020
  • Volume: 11
  • Issue: 4
  • Pages: 383-393
  • DOI: 10.21817/indjcse/2020/v11i4/201104264
  • Publisher: Engg Journals Publications
  • Abstract: The ever-growing body of biological research generates large volumes of biological data and knowledge bases, ranging from clinical test results to genome analyses. The dynamic changes in genome sequences, and the complexity of these databases and their relations, pose many challenges for data analysis. Many online databases are available for biological studies. It is essential that biological data can be analyzed in a multidimensional way by building a data warehouse and then applying online analytical processing (OLAP). Among multidimensional modeling methods, the star schema is not sufficient for biological data because it cannot accommodate the additional relationships. The snowflake schema captures relations among datasets better than the star schema, but it cannot model all data from all databases, especially the hidden states of long new biological sequences or complex medical data. Given this scenario, the approach in this paper combines generating datasets with a Hidden Markov Model (HMM) from all types of biological databases available online with the fact constellation schema of data warehouse modeling. The HMM is adopted in this study to find new datasets and to help analyze the relations between them. Once the datasets are generated, fact constellation multidimensional modeling is carried out to build the data warehouse. The new model proposed in this work is called the BioFactHMM schema; it is proposed specifically for biological data and is a mix of the star and snowflake schemas. The model aims to capture all the semantics of biological sequences from various data sources using the HMM. Data warehouse modeling is then done following the design principles of the fact constellation schema. Finally, OLAP cube analysis is used to view the data and reports in a multidimensional way.
  • Keywords: HMM; Multidimensional; Genome data; Biological data; Data Warehouse; Data Modeling; Fact Constellation; Biological databases