Siddhartha Jain

Gates-Dell Complex, 4.410E · Austin, TX 78712
[email protected]

Hiya! I'm a PhD student at UT Austin, fortunate to be advised by Scott Aaronson. Previously, I was a MSc student at EPFL and lucky to be mentored by Mika Göös.


Papers

What is TFNP?TFNP has been studied since the 90s [Pap94] as a framework to classify NP-intermediate problems: problems which are not NP-hard but also not believed to be in P. Given that we believe quantum computers cannot solve NP-hard problems [BBBV97], NP-intermediate problems are prime candidates for useful quantum advantage.
TFNP (Total Function NP) is the complexity class of NP search problems which always have a solution. It is a search problem, which means that you recieve an input x and you want to find one of many possible y such that (x,y) is a valid input-solution pair. It is in NP which means that checking that (x,y) is valid can be done efficiently. Finally, it is total which means that given any x you are certain that there is some y such that (x,y) is a valid input-solution pair.
CreditAuthors are listed in alphabetical order in last name for most papers, which is standard in mathematics and theoretical computer science. Papers marked with an asterisk (*) use contribution order, which is standard in physics and applied areas of computer science.

Quantum Communication Advantage in TFNP

Mika Göös, Tom Gur, Siddhartha Jain, Jiawei Li
QIP 2025
Abstract We exhibit a total search problem whose communication complexity in the quantum SMP (simultaneous message passing) model is exponentially smaller than in the classical two-way randomized model. Moreover, the quantum protocol is computationally efficient and its solutions are classically verifiable, that is, the problem lies in the communication analogue of the class TFNP. Our problem is a bipartite version of a query complexity problem recently introduced by Yamakawa and Zhandry (JACM 2024). We prove the classical lower bound using the structure-vs-randomness paradigm for analyzing communication protocols.
Summary (for non-specialists) We demonstrate a communication problem which can be solved in the quantum setting by using a few hundred qubits. Yet we prove that classically, any protocol must communicate more bits than there are atoms in the observable universe.
In communication complexity, Alice and Bob recieve inputs x and y and they want to compute a known function which depends on both x and y while communicating as few (qu)bits as possible. Simulatenous message passing (SMP) is a very restricted form of communication where they are not allowed to interact with each other at all and just send a single message to a referee who computes the answer. Our protocol works in the quanutm SMP model, and we prove a lower bound in the vanilla model of classical communication where there are no restrictions placed on Alice and Bob. A crucial feature of our result is also that our problem is total: any string of the required length is allowed as input. This differs from most instances of quantum advantage that we know, since we often need to promise that our input comes from special sets which allow the quantum computer to solve it efficiently.

Consumable Data via Quantum Communication

*Dar Gilboa, Siddhartha Jain, Jarrod McClean
ICML 2024 Workshop on Agentic Markets
Abstract Classical data can be copied and re-used for computation, with adverse consequences economically and in terms of data privacy. Motivated by this, we formulate problems in one-way communication complexity where Alice holds some data and Bob holds m inputs, and he wants to compute m instances of a bipartite relation on Alice's data and each of his inputs. We call this the asymmetric direct sum question for one-way communication. We give a number of examples where the quantum communication complexity of such problems scales polynomially with m, while the classical communication complexity depends at most logarithmically on m. For these examples, data behaves like a consumable resource when the owner stores and transmits it as quantum states. We show an application to a strategic data-selling game, and discuss other potential economic implications.

On Pigeonhole Principles and Ramsey in TFNP

FOCS 2024
Abstract We show that the TFNP problem RAMSEY is not black-box reducible to PIGEON, refuting a conjecture of Goldberg and Papadimitriou in the black-box setting. We prove this by giving reductions to RAMSEY from a new family of TFNP problems that correspond to generalized versions of the pigeonhole principle, and then proving that these generalized versions cannot be reduced to PIGEON. Formally, we define t-PPP as the class of total NP-search problems reducible to finding a t-collision in a mapping from (t-1)N+1 pigeons to N holes. These classes are closely related to multi-collision resistant hash functions in cryptography. We show that the generalized pigeonhole classes form a hierarchy as t increases, and also give a natural condition on the parameters t1, t2 that captures exactly when t1-PPP and t2-PPP collapse in the black-box setting. Finally, we prove other inclusion and separation results between these generalized PIGEON problems and other previously studied TFNP subclasses, such as PLS, PPA and PLC. Our separation results rely on new lower bounds in propositional proof complexity based on pseudoexpectation operators, which may be of independent interest.
Summary (for non-specialists)This paper is motivated by the metamathematical question: is there a short (efficient) proof of a complicated theorem using a simple theorem? More specifically, can Ramsey's theorem in extremal combinatorics be proved using the (simple) pigeonhole principle?
A celebrated theorem of Frank Ramsey says that in any undirected, simple graph on 2^n vertices, there must exist a clique or independent set of size n/2. RAMSEY is the search problem of finding such a clique or independent set given a succinctly represented graph as input. Similarly, PIGEON is the search problem corresponding to the pigeonhole principle. We show that there is no black-box reduction from RAMSEY to PIGEON: finding a clique or independent set does not reduce to finding a collision. On the way to proving this, we define new problems corresponding to the generalized pigeonhole principle: if tn + 1 pigeons are put into n holes, some hole must have t+1 pigeons. We prove a structure theorem about these problems and study its relation to cryptography.

On the Rational Degree of Boolean Functions and Applications

Manuscript
Abstract We study a natural complexity measure of Boolean functions known as the (exact) rational degree. For total functions f, it is conjectured that rdeg(f) is polynomially related to deg(f), where deg(f) is the Fourier degree. Towards this conjecture, we show that symmetric functions have rational degree at least deg(f)/2 and monotone functions have rational degree at least \sqrt{deg(f)}. We observe that both of these lower bounds are tight. In addition, we show that all read-once depth-d Boolean formulae have rational degree at least Ω(deg(f)1/d). Furthermore, we show that almost every Boolean function on n variables has rational degree at least n/2 - O(\sqrt{n}). In contrast to total functions, we exhibit partial functions that witness unbounded separations between rational and approximate degree, in both directions. As a consequence, we show that for quantum computers, post-selection and bounded-error are incomparable resources in the black-box model.

Communication Complexity of Collision

Mika Göös, Siddhartha Jain
RANDOM 2022
Abstract The Collision problem is to decide whether a given list of n-bit numbers (x) is 1-to-1 or 2-to-1 when promised one of them is the case. We show a polynomial randomised communication lower bound for the natural two-party version of Collision where Alice holds the first half of the bits of each x and Bob holds the second half. As an application, we also show a similar lower bound for a weak bit-pigeonhole search problem, which answers a question of Itsykson and Riazanov (CCC 2021).
Follow-up The exponent 1/12 in our lower bound was improved to a tight exponent of 1/4 by Yang & Zhang.
Summary (for non-specialists)We study a natural problem in communication complexity where Alice and Bob want to check if their shared list of numbers has repeated elements (collisions) or not. Our results are useful in cryptography.
In communication complexity, Alice and Bob recieve inputs x and y and they want to compute a known function which depends on both x and y while communicating as few bits as possible. We study the problem where we have a list of n-bit numbers which are either all unique or every number is repated twice. Alice recieves the first half of every number and Bob recieves the second half, and they want to decide which is the case. We prove the first non-trivial lower bound for this problem, which has an application cryptography. In particular, it shows that it is possible to build a pair of hash functions for which it is hard to find a collision even if each function in the pair can be accessed through a backdoor by some adversary. These are called Backdoored Random Oracles.

Separations in Proof Complexity and TFNP

Journal of the ACM, FOCS 2022
Abstract It is well-known that Resolution proofs can be efficiently simulated by Sherali-Adams (SA) proofs. We show, however, that any such simulation needs to exploit huge coefficients: Resolution cannot be efficiently simulated by SA when the coefficients are written in unary. We also show that reversible Resolution cannot be efficiently simulated by Nullstellensatz (NS) over the reals.
These results have consequences for total NP search problems. First, we characterise the classes PPADS, PPAD, SOPL by unary-SA, unary-NS, and Reversible Resolution, respectively. Second, we show that, relative to an oracle, PLS\not\subseteq PPP, SOPL\not\subseteq PPA, and EOPL\not\subseteq UEOPL. Together with prior work, this gives a complete picture of the black-box relationships between the classical TFNP classes introduced in the 1990s.
Summary (for non-specialists)Is every problem solvable by local search also solvable by collision-finding? We prove that the answer to this question is no, within the formalization of TFNP.
To formalize this, PLS is the class of problems which can be solved using local search. PPP is the class of problems which can be solved by finding a collision in a compressing function. We show that there is no black-box reduction from PLS to PPP: there is no way to convert a reduction to local search into a reduction to collision-finding. This was an open question since these classes were defined almost three decades ago. Our result goes through a connection to proof complexity, which studies how long your proof needs to be if you're only allowed to use certain rules of inference.

Further Collapses in TFNP

SICOMP, CCC 2022, HALG 2022
Abstract We show EOPL = PLS intersect PPAD. Here the class EOPL consists of all total search problems that reduce to the End-of-Potential-Line problem, which was introduced in the works by Hubacek and Yogev (SODA 2017) and Fearnley et al. (JCSS 2020). In particular, our result yields a new simpler proof of the breakthrough collapse CLS = PLS intersect PPAD by Fearnley et al. (STOC 2021). We also prove a companion result SOPL = PLS intersect PPADS, where SOPL is the class associated with the Sink-of-Potential-Line problem.
Summary (for non-specialists) We show that the complexity class EOPL contains exactly the problems which can be solved efficiently using local search and by computing market equillibria.
Problems which can be solved using local search can be formalized to be in the complexity class PLS. Similarly, PPAD is the class of problems which have a solution due to a parity argument, and famously the problem of computing a Nash Equillibirum is complete for this class. We show that a problem called End-of-Potential-Line is complete for their intersection: problem A reduces to EOPL if and only if it lies in both PLS and PPAD. As a corollary of this work, we give a simple combinatorial proof of the fact the the class CLS (problems which can be solved using gradient descent) is also equal to this intersection.

Unambiguous DNFs and Alon-Saks-Seymour

FOCS 2021 Invited to SICOMP Special Issue
Abstract We exhibit an unambiguous k-DNF formula that requires CNF width almost k^2, which is optimal up to logarithmic factors. As a consequence, we get a near-optimal solution to the Alon--Saks--Seymour problem in graph theory (posed in 1991), which asks: How large a gap can there be between the chromatic number of a graph and its biclique partition number? Our result is also known to imply several other improved separations in query and communication complexity.
Follow-upOur graph-theoretic result was used to show that partial concept classes cannot be represented by total concept classes of small dimension in the work of Alon, Hanneke, Holzman & Moran.
Summary (for non-specialists)We answer a long-standing open problem in graph theory via reductions to questions in query and communication complexity. We do not know of a way to rephrase our proof in the language of graph theory or combinatorics, so this is an example of techniques in computer science being useful for pure math. For this reason, Noga Alon mentioned our work in his Knuth Prize lecture.
Alon--Saks--Seymour refers to a line of work in graph theory where the goal is to understand how much larger one property of the graph (chromatic number) can be than another property (biclique partition number). Figuring out the exact relationship was known to be equivalent to figuring out the communication complexity of the "clique vs independent set" problem. In communication complexity, Alice and Bob recieve inputs x and y and they want to compute a known function which depends on both x and y while communicating as few bits as possible. In this paper, we resolve the communication complexity of clique vs independent set when Alice and Bob recieve some additional help from a third-party prover, which essentially answers the Alon--Saks--Seymour problem in combinatorics when considering arbitrary graphs.

Education

UT Austin

Doctor of Philosophy
Computer Science
2022 - Present

EPFL

Master of Science
Computer Science
2020 - 2022

IIIT-Delhi

Bachelor of Technology
Computer Science & Applied Mathematics
  • Co-founded the Math club (Évariste)
2016 - 2020

-->