Journal: International Journal of Intelligence Science
Print ISSN: 2163-0283
Online ISSN: 2163-0356
Publication year: 2017
Volume: 07
Issue: 01
Pages: 9-23
DOI: 10.4236/ijis.2017.71002
Language: English
Publisher: Scientific Research Publishing
Abstract: This paper proposes a sequential pattern discovery method for Deoxyribonucleic Acid (DNA) sequence databases in order to identify cancer. When the DNA of the P53 gene, which encodes the amino acids of the P53 protein, is mutated, the formation of P53 changes. Sequential pattern discovery is the process of extracting knowledge from data about series of events that occur in sequence with a certain frequency and therefore form a pattern. PrefixSpan is the proposed method for finding patterns in the DNA sequence database. The result is a set of selected DNA sequence patterns. Patterns with high similarity are used as biomarkers to identify breast cancer. The average support value is 0.8, which means that the sequence patterns are highly frequent. Another measure is confidence; all confidence values are 1. The last performance measure is the lift ratio, which averages more than 1, meaning that the sequence items composing a pattern are highly dependent and related. Furthermore, the selected patterns, applied as biomarkers, achieve an accuracy of 100%.
Keywords: Sequential Pattern; Breast Cancer; DNA; PrefixSpan; Lift Ratio
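
The abstract reports support, confidence, and lift ratio without stating how they are computed. The sketch below shows one common way to define them for sequential patterns. It is not the paper's implementation: the helper names (contains, support, confidence, lift), the gapped-subsequence matching rule, and the toy sequences in toy_db are all assumptions made for illustration, and it does not implement PrefixSpan itself.

```python
def contains(sequence, pattern):
    """Return True if pattern occurs in sequence as an ordered subsequence.

    Assumption: gaps between matched symbols are allowed, as in typical
    sequential pattern mining; the paper may use a stricter rule.
    """
    remaining = iter(sequence)
    # Each `symbol in remaining` consumes the iterator up to the match,
    # so later symbols must be found after earlier ones.
    return all(symbol in remaining for symbol in pattern)


def support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database) / len(database)


def confidence(database, antecedent, consequent):
    """support(antecedent followed by consequent) / support(antecedent).

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    return support(database, antecedent + consequent) / support(database, antecedent)


def lift(database, antecedent, consequent):
    """confidence(antecedent -> consequent) / support(consequent).

    Values above 1 suggest the two parts co-occur more often than
    they would if they were independent.
    """
    return confidence(database, antecedent, consequent) / support(database, consequent)


# Hypothetical toy database, not data from the paper.
toy_db = ["ACGT", "ACGA", "ACTT", "GGCA"]

if __name__ == "__main__":
    print("support(AC)       =", support(toy_db, "AC"))
    print("confidence(AC->G) =", confidence(toy_db, "AC", "G"))
    print("lift(AC->G)       =", lift(toy_db, "AC", "G"))
```

Under these definitions, a confidence of 1 means every sequence containing the antecedent also contains the full pattern, and an average lift above 1 is consistent with the abstract's reading that the items in a pattern are dependent on one another.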