摘要:A recent model for property testing of probability distributions (Chakraborty et
al., ITCS 2013, Canonne et al., SICOMP 2015) enables tremendous savings in the sample
complexity of testing algorithms, by allowing them to condition the sampling on subsets of
the domain. In particular, Canonne, Ron, and Servedio (SICOMP 2015) showed that, in this
setting, testing identity of an unknown distribution D (i. e., whether D = D
∗
for an explicitly
known D
∗
) can be done with a constant number of queries (i. e., samples), independent of the
support size n—in contrast to the required Ω(
√
n) in the standard sampling model. However,
it was unclear whether the same stark contrast exists for the case of testing equivalence,
where both distributions are unknown. Indeed, while Canonne et al. established a poly(logn)-
query upper bound for equivalence testing, very recently brought down to O˜(loglogn) by
Falahatgar et al. (COLT 2015), whether a dependence on the domain size n is necessary was still open, and explicitly posed by Fischer at the Bertinoro Workshop on Sublinear Algorithms
(2014). In this article, we answer the question in the affirmative, showing that any testing
algorithm for equivalence must make Ω
√
loglogn
queries in the conditional sampling
model. Interestingly, this demonstrates a gap between identity and equivalence testing, absent
in the standard sampling model (where both problems have sampling complexity n
Θ(1)
).
关键词:property testing; distribution testing; conditional sampling; lower bounds;;
equivalence testing; uniformity testing; support size estimation