首页    期刊浏览 2024年12月02日 星期一
登录注册

文章基本信息

  • 标题:Multi-pattern Matching with Wildcards
  • 本地全文:下载
  • 作者:Zhang, Meng ; Zhang, Yi ; Tang, Jijun
  • 期刊名称:Journal of Software
  • 印刷版ISSN:1796-217X
  • 出版年度:2011
  • 卷号:6
  • 期号:12
  • 页码:2391-2398
  • DOI:10.4304/jsw.6.12.2391-2398
  • 语种:English
  • 出版社:Academy Publisher
  • 摘要:Multi-pattern matching with wildcards is to find all the occurrences of a set of patterns with wildcards in a text. This problem arises in various fields, such as computational biology and network security. But the problem is not extensively studied as the single pattern case and there is no efficient algorithm for this problem. In this paper, we present efficient algorithms based on the fast Fourier transform. Let $P=\{p^1,\ldots,p^k\}$ be a set of patterns with wildcards where the total length of patterns is $|P|$, and a text $t$ of length $n$ over alphabet $a_1,\ldots,a_{\sigma}$. We present three algorithms for this problem where patterns are matched simultaneously. The first algorithm finds the matches of a small set of patterns in the text in $O(n\log |P|+occ\log k)$ time where $occ$ is the total number of occurrences of $P$ in $t$. The words used in the algorithm are of size $k\lceil2\lg\sigma\rceil+\sum_{i=1}^k \lceil\lg|p^i|\rceil$ bits. The second algorithm is based on a prime number encoding. It runs in time $O(n\log m+occ\log k)$ where $m$ is the length of the longest pattern in $P$. The algorithm uses words with $k\lceil \lg(2m\sigma^2+k^2)\rceil$ bits. The third one finds the occurrences of patterns in the text in time $O(n\log |P|\log\sigma+occ\log k)$ by computing the Hamming distance between patterns and the text. The algorithm uses words with $\sum_{i=1}^k \lceil\lg|p^i|\rceil$ bits. Moreover, we demonstrate an FFT implementation based on the modular arithmetic for machines with 64-bit word. Finally, we show that these algorithms can be easily parallelized, and the parallelized algorithms are given as well.
  • 关键词:algorithm;multi-pattern matching;wildcards;FFT
国家哲学社会科学文献中心版权所有