Papers
What is TFNP?
TFNP has been studied since the 90s [Pap94] as a framework to classify NP-intermediate problems: problems
which are not NP-hard but also not believed to be in P. Given that we believe quantum computers cannot solve NP-hard problems [BBBV97], NP-intermediate
problems are prime candidates for useful quantum advantage.
TFNP (Total Function NP) is the complexity class of NP search problems which always have a solution. It is a search problem, which means that you recieve an input x and you want to find one of many
possible y such that (x,y) is a valid input-solution pair. It is in NP which means that checking that (x,y) is valid can be done efficiently. Finally, it is total which means that given any x you are certain that there is some y
such that (x,y) is a valid input-solution pair.
Credit
Authors are listed in alphabetical order in last name for most papers, which is standard in mathematics and theoretical computer science. Papers marked with an asterisk (*) use contribution order,
which is standard in physics and applied areas of computer science.
Publications
On Pigeonhole Principles and Ramsey in TFNP
FOCS 2024
Abstract
We show that the TFNP problem RAMSEY is not black-box reducible to PIGEON, refuting a conjecture of Goldberg and Papadimitriou in the black-box setting.
We prove this by giving reductions to RAMSEY from a new family of TFNP problems that correspond to generalized versions of the pigeonhole principle, and then proving that these generalized versions cannot be reduced to PIGEON.
Formally, we define t-PPP as the class of total NP-search problems reducible to finding a t-collision in a mapping from (t-1)N+1 pigeons to N holes.
These classes are closely related to multi-collision resistant hash functions in cryptography. We show that the generalized pigeonhole classes form a hierarchy as t increases,
and also give a natural condition on the parameters t1, t2 that captures exactly when t1-PPP and t2-PPP collapse in the black-box setting.
Finally, we prove other inclusion and separation results between these generalized PIGEON problems and other previously studied TFNP subclasses, such as PLS, PPA and PLC.
Our separation results rely on new lower bounds in propositional proof complexity based on pseudoexpectation operators, which may be of independent interest.
Summary (for non-specialists)
This paper is motivated by the metamathematical question: is there a short (efficient) proof of a complicated theorem using a simple theorem? More specifically,
can Ramsey's theorem in extremal combinatorics be proved using the (simple) pigeonhole principle?
A celebrated theorem of Frank Ramsey says that in any undirected, simple graph on 2^n vertices, there must exist a clique
or independent set of size n/2. RAMSEY is the search problem of finding such a clique or independent set given a succinctly represented graph as input. Similarly, PIGEON is
the search problem corresponding to the pigeonhole principle. We show that there is no black-box reduction from RAMSEY to PIGEON: finding a clique or independent set does not reduce to finding a collision.
On the way to proving this, we define new problems corresponding to the generalized pigeonhole principle: if tn + 1 pigeons are put
into n holes, some hole must have t+1 pigeons. We prove a structure theorem about these problems and study its relation to cryptography.
Communication Complexity of Collision
RANDOM 2022
Abstract
The Collision problem is to decide whether a given list of n-bit numbers (x) is 1-to-1 or 2-to-1 when promised one of them is the case.
We show a polynomial randomised communication lower bound for the natural two-party version of Collision where Alice holds the first half of the bits of each x and Bob holds
the second half. As an application, we also show a similar lower bound for a weak bit-pigeonhole search problem, which answers a question of Itsykson and Riazanov (CCC 2021).
Follow-up
The exponent 1/12 in our lower bound was improved to a tight exponent of 1/4 by Yang & Zhang.
Summary (for non-specialists)
We study a natural problem in communication complexity where Alice and Bob want to check if their shared list of numbers has repeated elements (collisions) or not.
Our results are useful in cryptography.
In communication complexity, Alice and Bob recieve inputs x and y
and they want to compute a known function which depends on both x and y while communicating as few bits as possible. We study the problem where we have a list of n-bit numbers which are either all unique
or every number is repated twice. Alice recieves the first half of every number and Bob recieves the second half, and they want to decide which is the case. We prove the first non-trivial
lower bound for this problem, which has an application cryptography. In particular, it shows that it is possible to build a pair of hash functions for which it is hard to find a collision even if
each function in the pair can be accessed through a backdoor by some adversary. These are called Backdoored Random Oracles.
Separations in Proof Complexity and TFNP
Journal of the ACM, FOCS 2022
Abstract
It is well-known that Resolution proofs can be efficiently simulated by Sherali-Adams (SA) proofs.
We show, however, that any such simulation needs to exploit huge coefficients: Resolution cannot be efficiently simulated by SA when the coefficients are written in unary.
We also show that reversible Resolution cannot be efficiently simulated by Nullstellensatz (NS) over the reals.
These results have consequences for total NP search problems. First, we characterise the classes PPADS, PPAD, SOPL by unary-SA, unary-NS, and Reversible Resolution,
respectively. Second, we show that, relative to an oracle, PLS\not\subseteq PPP, SOPL\not\subseteq PPA, and EOPL\not\subseteq UEOPL. Together with prior work, this gives
a complete picture of the black-box relationships between the classical TFNP classes introduced in the 1990s.
Summary (for non-specialists)
Is every problem solvable by local search also solvable by collision-finding?
We prove that the answer to this question is no, within the formalization of TFNP.
To formalize this, PLS is the class of problems which can be solved using local search.
PPP is the class of problems
which can be solved by finding a collision in a compressing function. We show that there is no black-box reduction from PLS to PPP: there is no way to convert a reduction to local search into
a reduction to collision-finding. This was an open question since these classes were defined almost three decades ago. Our result goes through a connection to proof complexity, which studies how
long your proof needs to be if you're only allowed to use certain rules of inference.
Further Collapses in TFNP
SICOMP, CCC 2022, HALG 2022
Abstract
We show EOPL = PLS intersect PPAD. Here the class EOPL consists of all total search problems that reduce to the
End-of-Potential-Line problem, which was introduced in the works by Hubacek and Yogev (SODA 2017) and
Fearnley et al. (JCSS 2020). In particular, our result yields a new simpler proof of the breakthrough collapse CLS = PLS intersect PPAD
by Fearnley et al. (STOC 2021). We also prove a companion result SOPL = PLS intersect PPADS, where SOPL is the class associated
with the Sink-of-Potential-Line problem.
Summary (for non-specialists)
We show that the complexity class EOPL contains exactly the problems which can be solved efficiently using local search and by computing market equillibria.
Problems which can be solved using local search can be formalized to be in the complexity class PLS. Similarly,
PPAD is the class of problems which have a solution due to a parity argument, and famously the problem of computing a Nash Equillibirum is complete for this class. We show that
a problem called End-of-Potential-Line is complete for their intersection: problem A reduces to EOPL if and only if it lies in both PLS and PPAD. As a corollary of this work, we give a simple combinatorial proof of the fact the the class CLS
(problems which can be solved using gradient descent) is also equal to this intersection.
Unambiguous DNFs and Alon-Saks-Seymour
FOCS 2021 Invited to SICOMP Special Issue
Abstract
We exhibit an unambiguous k-DNF formula that requires CNF width almost k^2, which is optimal up to logarithmic factors.
As a consequence, we get a near-optimal solution to the Alon--Saks--Seymour problem in graph theory (posed in 1991), which asks: How large a gap can there
be between the chromatic number of a graph and its biclique partition number? Our result is also known to imply several other improved separations in
query and communication complexity.
Follow-up
Our graph-theoretic result was used to show that partial concept classes cannot be represented by total concept classes of small dimension in the work of Alon, Hanneke, Holzman & Moran.
Summary (for non-specialists)
We answer a long-standing open problem in graph theory via reductions to questions in query and communication complexity.
We do not know of a way to rephrase our proof in the language of graph theory or combinatorics, so this is an example of techniques in computer science being useful for pure math. For this reason, Noga Alon mentioned our work in his Knuth Prize lecture.
Alon--Saks--Seymour refers to a line of work in graph theory where the goal is to understand how much larger one property of the graph (chromatic number)
can be than another property (biclique partition number). Figuring out the exact relationship was known to be equivalent to figuring out the communication complexity of the "clique vs independent set" problem. In communication complexity, Alice and Bob recieve inputs x and y
and they want to compute a known function which depends on both x and y while communicating as few bits as possible. In this paper, we resolve the communication complexity of clique vs independent set when Alice and Bob recieve some additional help from a third-party prover,
which essentially answers the Alon--Saks--Seymour problem in combinatorics when considering arbitrary graphs.
Manuscripts
Quantum Communication Advantage in TFNP
Preprint
Abstract
We exhibit a total search problem whose communication complexity in the quantum SMP (simultaneous message passing) model is exponentially smaller than in the classical two-way randomized model.
Moreover, the quantum protocol is computationally efficient and its solutions are classically verifiable, that is, the problem lies in the communication analogue of the class TFNP. Our problem is a bipartite version of a query
complexity problem recently introduced by Yamakawa and Zhandry (JACM 2024). We prove the classical lower bound using the structure-vs-randomness paradigm for analyzing communication protocols.
Summary (for non-specialists)
We demonstrate a communication problem which can be solved in the quantum setting by using a few hundred qubits. Yet we prove that classically, any protocol must communicate
more bits than there are atoms in the observable universe.
In communication complexity, Alice and Bob recieve inputs x and y and they want to compute a known function which depends on both x and y while communicating as few (qu)bits as possible. Simulatenous message passing (SMP) is a very restricted form of
communication where they are not allowed to interact with each other at all and just send a single message to a referee who computes the answer. Our protocol works in the quanutm SMP model, and we prove a lower bound in the vanilla model of classical communication
where there are no restrictions placed on Alice and Bob. A crucial feature of our result is also that our problem is total: any string of the required length is allowed as input. This differs from most instances of quantum advantage that we know, since we often need to
promise that our input comes from special sets which allow the quantum computer to solve it efficiently.
Consumable Data via Quantum Communication
ICML 2024 Workshop on Agentic Markets
Abstract
Classical data can be copied and re-used for computation, with adverse consequences economically and in terms of data privacy. Motivated by this, we formulate problems in one-way
communication complexity where Alice holds some data and Bob holds m inputs, and he wants to compute m instances of a bipartite relation on Alice's data and each of his inputs. We call this the asymmetric direct
sum question for one-way communication. We give a number of examples where the quantum communication complexity of such problems scales polynomially with m, while the classical communication complexity depends at
most logarithmically on m. For these examples, data behaves like a consumable resource when the owner stores and transmits it as quantum states. We show an application to a strategic data-selling game, and discuss
other potential economic implications.
On the Rational Degree of Boolean Functions and Applications
Manuscript
Abstract
We study a natural complexity measure of Boolean functions known as the (exact) rational degree. For total functions f, it is conjectured that rdeg(f) is polynomially related to deg(f), where deg(f) is the Fourier degree. Towards this conjecture, we show that symmetric functions have rational degree at least deg(f)/2 and monotone functions have rational degree at least \sqrt{deg(f)}. We observe that both of these lower bounds are tight. In addition, we show that all read-once depth-d Boolean formulae have rational degree at least Ω(deg(f)1/d). Furthermore, we show that almost every Boolean function on n variables has rational degree at least n/2 - O(\sqrt{n}).
In contrast to total functions, we exhibit partial functions that witness unbounded separations between rational and approximate degree, in both directions. As a consequence, we show that for quantum computers, post-selection and bounded-error are incomparable resources in the black-box model.