摘要:We consider the problem of representing Boolean functions exactly by "sparse" linear combinations (over R) of functions from some "simple" class C. In particular, given C we are interested in finding low-complexity functions lacking sparse representations. When C forms a basis for the space of Boolean functions (e.g., the set of PARITY functions or the set of conjunctions) this sort of problem has a well-understood answer; the problem becomes interesting when C is "overcomplete" and the set of functions is not linearly independent. We focus on the cases where C is the set of linear threshold functions, the set of rectified linear units (ReLUs), and the set of low-degree polynomials over a finite field, all of which are well-studied in different contexts.
We provide generic tools for proving lower bounds on representations of this kind. Applying these, we give several new lower bounds for "semi-explicit" Boolean functions. Let alpha(n) be an unbounded function such that n^{alpha(n)} is time constructible (e.g. alpha(n) = log^*(n)). We show:
- Functions in NTIME[n^{alpha(n)}] that require super-polynomially many linear threshold functions to represent (depth-two neural networks with sign activation function, a special case of depth-two threshold circuit lower bounds).
- Functions in NTIME[n^{alpha(n)}] that require super-polynomially many ReLU gates to represent (depth-two neural networks with ReLU activation function).
- Functions in NTIME[n^{alpha(n)}] that require super-polynomially many O(1)-degree F_p-polynomials to represent exactly, for every prime p (related to problems regarding Higher-Order "Uncertainty Principles"). We also obtain a function in E^{NP} requiring 2^{Omega(n)} linear combinations.
- Functions in NTIME[n^{poly(log n)}] that require super-polynomially many ACC ° THR circuits to represent exactly (further generalizing the recent lower bounds of Murray and the author). We also obtain "fixed-polynomial" lower bounds for functions in NP, for the first three representation classes. All our lower bounds are obtained via algorithms for analyzing linear combinations of simple functions in the above scenarios, in ways which substantially beat exhaustive search.