Abstract: The shift distance sh(S₁,S₂) between two strings S₁ and S₂ of the same length is defined as the minimum Hamming distance between S₁ and any rotation (cyclic shift) of S₂. We study the problem of sketching the shift distance, which is the following communication complexity problem: strings S₁ and S₂ of length n are given to two identical players (encoders), who independently compute sketches (summaries) sk(S₁) and sk(S₂), respectively, so that upon receiving the two sketches, a third player (decoder) is able to compute (or approximate) sh(S₁,S₂) with high probability. This paper primarily focuses on the more general k-mismatch version of the problem, where the decoder is allowed to declare a failure if sh(S₁,S₂) > k, where k is a parameter known to all parties. Andoni et al. (STOC'13) introduced exact circular k-mismatch sketches of size Õ(k+D(n)), where D(n) is the number of divisors of n. Andoni et al. also showed that their sketch size is optimal in the class of linear homomorphic sketches. We circumvent this lower bound by designing a (non-linear) exact circular k-mismatch sketch of size Õ(k); this size matches communication-complexity lower bounds. We also design a (1±ε)-approximate circular k-mismatch sketch of size Õ(min(ε^{-2}√k, ε^{-1.5}√n)), which improves upon an Õ(ε^{-2}√n)-size sketch of Crouch and McGregor (APPROX'11).
Keywords: Hamming distance; k-mismatch; sketches; rotation; cyclic shift; communication complexity
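
To make the definition of the shift distance concrete, the following is a minimal brute-force sketch in Python. It computes sh(S₁,S₂) directly from the definition in quadratic time, with both strings fully available; it is not the sketching scheme of the paper and uses no communication model. The function names `hamming`, `shift_distance`, and `circular_k_mismatch` are illustrative assumptions, as is the choice to signal failure with `None` when the distance exceeds k.

```python
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def shift_distance(s1: str, s2: str) -> int:
    """Minimum Hamming distance between s1 and any rotation of s2.

    Brute force over all n rotations: O(n^2) time, for illustration only.
    """
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s1)
    if n == 0:
        return 0
    # s2[i:] + s2[:i] is s2 rotated cyclically by i positions.
    return min(hamming(s1, s2[i:] + s2[:i]) for i in range(n))


def circular_k_mismatch(s1: str, s2: str, k: int) -> Optional[int]:
    """k-mismatch variant: return the shift distance if it is at most k.

    Returning None models the decoder being allowed to declare failure
    (an assumed convention for this illustration).
    """
    d = shift_distance(s1, s2)
    return d if d <= k else None
```

For example, `shift_distance("abcd", "cdab")` is 0 because "cdab" is a rotation of "abcd", while `circular_k_mismatch("abcd", "wxyz", 2)` reports failure because every rotation differs in all four positions.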