Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2020
Volume: 11
Issue: 2
DOI: 10.14569/IJACSA.2020.0110213
Publisher: Science and Information Society (SAI)
Abstract: Over years of applying machine learning in bioinformatics, we have learned that scientists working in many areas of the life sciences need deeper knowledge of the modeled phenomenon than the information required to classify objects with a given quality. Transcriptome profiling by RNA sequencing (RNA-seq), which captures the dynamic activity of genes, is becoming increasingly popular; it measures not only gene expression but also structural variations such as mutations and fusion transcripts. Moreover, single nucleotide polymorphisms (SNPs) have great potential in genetics, breeding, and ecological and evolutionary studies. Rough sets can be successfully employed to tackle problems such as gene expression clustering and classification. This study provides general guidelines for accurate SNP discovery from RNA-seq data. The SNP annotations are used to find the relation between the SNPs' biological features and the differential expression of the genes to which those SNPs belong. Rough sets are used to express this relationship as a finite set of rules. The 32 generated rules gave good results when evaluated by strength, certainty, and coverage. The strategy is applied to the analysis of SNPs in the plant A. thaliana under heat stress.
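The three evaluation terms named in the abstract have standard definitions in rough set theory. For a decision rule "condition -> decision" over a table of objects, strength is the fraction of all objects matching both sides, certainty is the fraction of condition-matching objects that also match the decision, and coverage is the fraction of decision-matching objects that also match the condition. The following is a minimal sketch of those definitions only, not the study's implementation; the attribute names (`region`, `effect`), the decision labels (`up`, `down`), and the rows are hypothetical and are not data from the paper.

```python
# Hypothetical toy decision table: (condition attributes, decision) per object.
# Attribute names and values are illustrative assumptions, not study data.
ROWS = [
    ({"region": "exon", "effect": "missense"}, "up"),
    ({"region": "exon", "effect": "missense"}, "up"),
    ({"region": "exon", "effect": "missense"}, "down"),
    ({"region": "intron", "effect": "none"}, "down"),
    ({"region": "intron", "effect": "none"}, "down"),
    ({"region": "exon", "effect": "synonymous"}, "up"),
]


def rule_measures(rows, condition, decision):
    """Return (strength, certainty, coverage) for the rule condition -> decision."""
    # Decisions of all objects whose attributes satisfy the condition part.
    cond_decisions = [
        d for attrs, d in rows
        if all(attrs.get(key) == value for key, value in condition.items())
    ]
    # Support: objects matching both the condition and the decision.
    support = sum(1 for d in cond_decisions if d == decision)
    # Objects matching the decision, regardless of condition.
    n_decision = sum(1 for _, d in rows if d == decision)

    strength = support / len(rows) if rows else 0.0
    certainty = support / len(cond_decisions) if cond_decisions else 0.0
    coverage = support / n_decision if n_decision else 0.0
    return strength, certainty, coverage


if __name__ == "__main__":
    rule = {"region": "exon", "effect": "missense"}
    s, c, v = rule_measures(ROWS, rule, "up")
    print(f"strength={s:.3f} certainty={c:.3f} coverage={v:.3f}")
```

In this toy table the rule matches three objects on its condition, two of which are labeled `up`, and three objects are labeled `up` overall, so certainty and coverage are both 2/3 and strength is 2/6.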