The problem of finding large cliques in random graphs and its ``planted" variant, where one wants to recover a clique of size $\omega \gg \log{(n)}$ added to an \Erdos-\Renyi graph $G \sim G(n,\frac{1}{2})$, have been intensely studied. Nevertheless, existing polynomial time algorithms can only recover planted cliques of size $\omega = \Omega(\sqrt{n})$. By contrast, information theoretically, one can recover planted cliques so long as $\omega \gg \log{(n)}$. In this work, we continue the investigation of algorithms from the sum of squares hierarchy for solving the planted clique problem begun by Meka, Potechin, and Wigderson \cite{MPW15} and Deshpande and Montanari \cite{DM15}. Our main result is that degree four SoS does not recover the planted clique unless $\omega \gg \sqrt n / \polylog n$, improving upon the bound $\omega \gg n^{1/3}$ due to \cite{DM15}. An argument of Kelner shows that the this result cannot be proved using the same certificate as prior works. Rather, our proof involves constructing and analyzing a new certificate that yields the nearly tight lower bound by ``correcting" the certificate of \cite{MPW15,DM15,FeigeK03}.

The Firefighter problem and a variant of it, known as Resource Minimization for Fire Containment (RMFC), are natural models for optimal inhibition of harmful spreading processes. Despite considerable progress on several fronts, the approximability of these problems is still badly understood. This is the case even when the underlying graph is a tree, which is one of the most-studied graph structures in this context and the focus of this paper. In their simplest version, a fire spreads from one fixed vertex step by step from burning to adjacent non-burning vertices, and at each time step B many non-burning vertices can be protected from catching fire. The Firefighter problem asks, for a given B, to maximize the number of vertices that will not catch fire, whereas RMFC (on a tree) asks to find the smallest B that allows for saving all leaves of the tree. Prior to this work, the best known approximation ratios were an O(1)-approximation for the Firefighter problem and an O(log^* n)-approximation for RMFC, both being LP-based and essentially matching the integrality gaps of two natural LP relaxations. We improve on both approximations by presenting a PTAS for the Firefighter problem and an O(1)-approximation for RMFC, both qualitatively matching the known hardness results. Our results are obtained through a combination of the known LPs with several new techniques, which allow for efficiently enumerating over super-constant size sets of constraints to strengthen the natural LPs.

There are two errors in our paper ``Approximating minimum-cost connectivity problems via uncrossable bifamilies'' (ACM Transactions on Algorithms (TALG), 9(1), Article No. 1, 2012). In that paper we consider the (undirected) {\sc Survivable Network} problem. The input consists of a graph $G=(V,E)$ with edge-costs, a set $T \subs V$ of terminals, and connectivity demands $\{r_{st}>0:st \in D \subs T \times T\}$. The goal is to find a minimum cost subgraph $H$ of $G$ that for all $st \in D$ contains $r_{st}$ pairwise internally-disjoint $st$-paths. We claimed ratios $O(k \ln k)$ for rooted demands when the set $D$ of demand pairs forms a star, where $k=\max_{st \in D} r_{st}$ is the maximum demand. This ratio is correct when the requirements are $r_{st}=k$ for all $t \in T \sem \{s\}$, but for general rooted demands our paper implies only ratio $O(k^2)$ (which however is still the currently best known ratio for the problem). We also obtained various ratios for the node-weighted version of the problem. These results are valid, but the proof needs a correction described here.

Chazelle and Matou\v sek [\emph{J. Algorithms}, 1996] presented a derandomization of Clarkson's sampling-based algorithm [\emph{J. ACM}, 1995] for solving linear programs with $n$ constraints and $d$ variables in $d^{(7+o(1))d}n$ deterministic time. The time bound can be improved to $d^{(5+o(1))d}n$ with subsequent work by Br\"onnimann, Chazelle, and Matou\v sek [\emph{SIAM J. Comput.}, 1999]. We first point out a much simpler derandomization of Clarkson's algorithm that avoids $\eps$-approximations and runs in $d^{(3+o(1))d}n$ time. We then describe a few additional ideas that eventually improve the deterministic time bound to $d^{(1/2+o(1))d}n$.

We consider a new construction of locality-sensitive hash functions for Hamming space that is covering in the sense that is it guaranteed to produce a collision for every pair of vectors within a given radius r. The construction is efficient in the sense that the expected number of hash collisions between vectors at distance cr, for a given c>1, comes close to that of the best possible data independent LSH without the covering guarantee, namely, the seminal LSH construction of Indyk and Motwani (STOC '98). The efficiency of the new construction essentially matches their bound when the search radius is not too large --- e.g., when cr = o(log(n)/log log n), where n is the number of points in the data set, and when cr = log(n)/k where k is an integer constant. In general, it differs by at most a factor ln(4) in the exponent of the time bounds. As a consequence, LSH-based similarity search in Hamming space can avoid the problem of false negatives at little or no cost in efficiency.

The SINR model attempts to predict whether a particular transmitter is heard at a specific location, in a setting consisting of n simultaneous transmitters and background noise. The SINR model gives rise to the SINR diagram, which partitions the space into n regions, one per transmitter, and the remaining space where no transmitter can be heard. Point location in the SINR diagram, i.e., determining which transmitter is heard at a query point (if any), has been investigated in several papers. These planar data structures are constructed in time at least quadratic in n and support logarithmic-time approximate queries. Moreover, the performance of some of them depends also on some geometric parameters that cannot be bounded as a function of n or epsilon. In this paper, we address the question of batched point-location queries, i.e., answering many queries simultaneously. In one dimension, we can answer n queries exactly in amortized polylogarithmic time per query, while in the plane we can do it approximately. These results can handle arbitrary power assignments to the transmitters. Moreover, the amortized query time depends only on n and epsilon. We also show how to speed up the preprocessing in a previously proposed point-location structure for uniform-power sites, by almost a full order of magnitude. For this we obtain results on the sensitivity of the reception regions to slight changes in the reception threshold, which are of independent interest. Finally, these results demonstrate the power of combining algebraic tools with those of computational geometry and other fields.

This paper shows the weighted matching problem on general graphs
can be solved in time $O(n(m + n\log n))$ for $n$ and $m$ the
number of vertices and edges, respectively. This was previously known
only for bipartite graphs. The crux is a data structure for blossom creation. It uses a dynamic nearest-common-ancestor algorithm
to simplify blossom steps, so they involve only back edges rather than arbitrary nontree edges.
The rest of the paper presents direct extensions of Edmonds'
blossom algorithm to weighted $b$-matching and $f$-factors. Again
the time bound is the one previously known for
bipartite graphs: for $b$-matching the time is $O(\min\{b(V),n\log
n\}(m + n\log n))$ and for $f$-factors the time is $O(\min {f(V),m\log n\}( m + n\log n) )$,
where $b(V)$ and $f(V)$ denote the sum of all degree constraints.
Several immediate applications of the $f$-factor algorithm
are given: The generalized shortest path structure of \cite{GS13}, i.e., the analog of the shortest path tree for
conservative undirected graphs, is shown to be a version of
the blossom structure for $f$-factors. This structure is found in time $O(|N|(m+n\log n))$ for $N$ the set of negative edges ($0<|N|

We investigate the complexity of several fundamental polynomial-time solvable problems on graphs and matrices of low treewidth. Our goal is to construct algorithms with running time of the form poly(k) * n, where k is the width of a tree decomposition given on input. Such procedures would outperform the best known algorithms already for moderate values of the treewidth, like O(n^{1/c}) for some small constant c. Our results include: - an algorithm for computing the determinant and the rank of an n x n matrix using O(k^3 * n) time and arithmetic operations; - an algorithm for solving a system of linear equations using O(k^3 * n) time and arithmetic operations; - an O(k^3 * n log n)-time randomized algorithm for finding the cardinality of a maximum matching in a graph; - an O(k^4 * n log^2 n)-time randomized algorithm for constructing a maximum matching in a graph; - an O(k^2 * n log n)-time algorithm for finding a maximum vertex flow in a directed graph. Moreover, we give an approximation algorithm for treewidth with suitable time complexity: given a graph G and integer k, in time O(k^7 * n log n) it concludes that G has treewidth larger than k, or constructs a decomposition of width O(k^2). The results stand in contrast with the recent work of Abboud et al. [SODA 2016], which shows that the algorithms with similar running times are unlikely for the problems of finding the diameter and radius of a graph of low treewidth.

Randomization is a fundamental tool used in many theoretical and practical areas of computer science. We study here the role of randomization in the area of submodular function maximization. In this area most algorithms are randomized, and in almost all cases the approximation ratios obtained by current randomized algorithms are superior to the best results obtained by known deterministic algorithms. Derandomization of algorithms for general submodular function maximization seems hard since the access to the function is done via a value oracle. This makes it hard, for example, to apply standard derandomization techniques such as conditional expectations. Therefore, an interesting fundamental problem in this area is whether randomization is inherently necessary for obtaining good approximation ratios. In this work we give evidence that randomization is not necessary for obtaining good algorithms by presenting a new technique for derandomization of algorithms for submodular function maximization. Our high level idea is to maintain explicitly a (small) distribution over the states of the algorithm, and carefully update it using marginal values obtained from an extreme point solution of a suitable linear formulation. We demonstrate our technique on two recent algorithms for unconstrained submodular maximization and for maximizing submodular function subject to a cardinality constraint. In particular, for unconstrained submodular maximization we obtain an optimal deterministic $1/2$-approximation showing that randomization is unnecessary for obtaining optimal results for this setting.

Let V be a set of n points in R^d, which we call voters, where d is a fixed constant. A point p in R^d is preferred over another point p' in R^d by a voter v in V if dist(v,p) < dist(v,p'). A point p is called a plurality point if it is preferred by at least as many voters as any other point p'. We present an algorithm that decides in O(n log n) time whether V admits a plurality point in the L_2 norm and, if so, finds the (unique) plurality point. We also give efficient algorithms to compute the smallest subset W of V such that V - W admits a plurality point, and to compute a so-called minimum-radius plurality ball. Finally, we consider the problem in the personalized L_1 norm, where each point v in V has a preference vector

We study the problem of detecting {\em outlier pairs} of strongly correlated variables among a collection of $n$ variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of $n$ vectors with unit Euclidean norm and dimension $d$, and we are asked to find all the outlier pairs of vectors whose inner product is at least $\rho$ in absolute value, subject to the promise that all but at most $q$ pairs of vectors have inner product at most $\tau$ in absolute value for some constants $0<\tau<\rho<1$. Improving on an algorithm of G.~Valiant [FOCS~2012; J.\,ACM~2015], we present a randomized algorithm that for Boolean inputs ($\{-1,1\}$-valued data normalized to unit Euclidean length) runs in time \[ \tilde O\bigl(n^{\max\,\{1-\gamma+M(\Delta\gamma,\gamma),\,M(1-\gamma,2\Delta\gamma)\}}+qdn^{2\gamma}\bigr)\,, \] where $0<\gamma<1$ is a constant tradeoff parameter and $M(\mu,\nu)$ is the exponent to multiply an $\lfloor n^\mu\rfloor\times\lfloor n^\nu\rfloor$ matrix with an $\lfloor n^\nu\rfloor\times \lfloor n^\mu\rfloor$ matrix and $\Delta=1/(1-\log_\tau\rho)$. As corollaries we obtain randomized algorithms that run in time \[ \tilde O\bigl(n^{\frac{2\omega}{3-\log_\tau\rho}}+qdn^{\frac{2(1-\log_\tau\rho)}{3-\log_\tau\rho}}\bigr) \] and in time \[ \tilde O\bigl(n^{\frac{4}{2+\alpha(1-\log_\tau\rho)}}+qdn^{\frac{2\alpha(1-\log_\tau\rho)}{2+\alpha(1-\log_\tau\rho)}}\bigr)\,, \] where $2\leq\omega<2.38$ is the exponent for square matrix multiplication and $0.3<\alpha\leq 1$ is the exponent for\, rectangular matrix multiplication. We present further corollaries for the light bulb problem and for learning sparse Boolean functions. (The notation {$\tilde O(\cdot)$} hides polylogarithmic factors in $n$ and $d$ whose degree may depend on $\rho$ and $\tau$.)

We consider the problems of online and stochastic packet queuing in a distributed system of n nodes with queues, where the communication between the nodes is done via a multiple access channel. In the online setting, in each round, an arbitrary number of packets can be injected into the system, each to an arbitrary node's queue. Two measures of performance are considered: the total number of packets in the system, called the total load, and the maximum queue size, called the maximum load. We develop a deterministic distributed algorithm that is asymptotically optimal with respect to both complexity measures, in a competitive way. More precisely, the total load of our algorithm is bigger than the total load of any other algorithm, including centralized online solutions, by only an additive term of O(n^2), while the maximum queue size of our algorithm is at most n times bigger than the maximum queue size of any other algorithm, with an extra additive O(n). The optimality for both measures is justified by proving the corresponding lower bounds, which also separates nearly exponentially distributed solutions from the centralized ones. Next, we show that our algorithm is also stochastically stable for any expected injection rate smaller or equal to 1. This is the first solution to the stochastic queuing problem on a multiple access channel that achieves such stability for the (highest possible) rate equal to 1.

Properties definable in first-order logic are algorithmically interesting for both theoretical and pragmatic reasons. Many of the most studied algorithmic problems, such as Hitting Set and Orthogonal Vectors, are first-order, and the first-order properties naturally arise as relational database queries. A relatively straightforward algorithm for evaluating a property with k+1 quantifiers takes time $O(m^k)$ and, assuming the Strong Exponential Time Hypothesis (SETH), some such properties require $O(m^{k-\epsilon})$ time for any $\epsilon > 0$. (Here, m represents the size of the input structure, i.e. the number of tuples in all relations.) We give algorithms for every first-order property that improves this upper bound to $m^k/2^{\Theta(\sqrt{\log n})}$, i.e., an improvement by a factor more than any poly-log, but less than the polynomial required to refute SETH. Moreover, we show that further improvement is equivalent to improving algorithms for sparse instances of the well-studied Orthogonal Vectors problem. Surprisingly, both results are obtained by showing completeness of the Sparse Orthogonal Vectors problem for the class of first-order properties under fine-grained reductions. To obtain improved algorithms, we apply the fast Orthogonal Vectors algorithm of [AWY15,CW16]. While fine-grained reductions (reductions that closely preserve the conjectured complexities of problems) have been used to relate the hardness of disparate specific problems both within P and beyond, this is the first such completeness result for a standard complexity class.

A simple digraph is semi-complete if for any two of its vertices u and v, at least one of the arcs (u,v) and (v,u) is present. We study the complexity of computing two layout parameters of semi-complete digraphs: cutwidth and optimal linear arrangement (Ola). We prove that: " Both parameters are NP-hard to compute and the known exact and parameterized algorithms for them have essentially optimal running times, assuming the Exponential Time Hypothesis. " The cutwidth parameter admits a quadratic Turing kernel, whereas it does not admit any polyno- mial kernel unless NP coNP/poly. By contrast, Ola admits a linear kernel. These results essentially complete the complexity analysis of computing cutwidth and Ola on semi- complete digraphs. Our techniques can be also used to analyze the sizes of minimal obstructions for having small cutwidth under the induced subdigraph relation.

We study the problem of $2$-dimensional orthogonal range counting with additive error. Given a set $P$ of $n$ points drawn from an $n\times n$ grid and an error parameter $\eps$, the goal is to build a data structure, such that for any orthogonal range $R$, it can return the number of points in $P\cap R$ with additive error $\eps n$. A well-known solution for this problem is the {\em $\eps$-approximation}, which is a subset $A\subseteq P$ that can estimate the number of points in $P\cap R$ with the number of points in $A\cap R$. It is known that an $\eps$-approximation of size $O(\frac{1}{\eps} \log^{2.5} \frac{1}{\eps})$ exists for any $P$ with respect to orthogonal ranges, and the best lower bound is $\Omega(\frac{1}{\eps} \log \frac{1}{\eps})$. The $\eps$-approximation is a rather restricted data structure, as we are not allowed to store any information other than the coordinates of the points in $P$. In this paper, we explore what can be achieved without any restriction on the data structure. We first describe a simple data structure that uses $O(\frac{1}{\eps}(\log^2\frac{1} {\eps} + \log n) )$ bits and answers queries with error $\eps n$. We then prove a lower bound that any data structure that answers queries with error $\eps n$ must use $\Omega(\frac{1}{\eps}(\log^2\frac{1} {\eps} + \log n) )$ bits. Our lower bound is information-theoretic: We show that there is a collection of $2^{\Omega(n\log n)}$ point sets with large {\em union combinatorial discrepancy}, and thus are hard to distinguish unless we use $\Omega(n\log n)$ bits.

In the Interval Completion problem we are given an n-vertex graph G and an integer k, and the task is to transform G by making use of at most k edge additions into an interval graph. This is a fundamental graph modifi cation problem with applications in sparse matrix multiplication and molecular biology. The question about fi xed-parameter tractability of Interval Completion was asked by Kaplan, Shamir and Tarjan [FOCS 1994; SIAM J. Comput. 1999] and was answered a rmatively more than a decade later by Villanger at el. [STOC 2007; SIAM J. Comput. 2009], who presented an algorithm with running time O(k^{2k} n^3 m). We give the first subexponential parameterized algorithm solving Interval Completion in time k^O(sqrt(k)) n^O(1). This adds Interval Completion to a very small list of parameterized graph modi fication problems solvable in subexponential time.

How efficiently can we find an unknown graph using distance or shortest path queries between its vertices? We assume that the unknown graph G is connected, unweighted, and has bounded degree. In the reconstruction problem, the goal is to find the graph G. In the verification problem, we are given a hypothetical graph $\hat G$ and want to check whether G is equal to $\hat G$. We provide a randomized algorithm for reconstruction using $\tilde O(n^{3/2})$ distance queries, based on Voronoi cell decomposition. Next, we analyze natural greedy algorithms for reconstruction using a shortest path oracle and also for verification using either oracle, and show that their query complexity is $n^{1+o(1)}$. We further improve the query complexity when the graph is chordal or outerplanar. Finally, we show some lower bounds, and consider an approximate version of the reconstruction problem.

The Weighted Tree Augmentation Problem (WTAP) is a fundamental well-studied problem in the field of network design. Given an undirected tree $G=(V,E)$, an additional set of edges $L \subseteq V\times V$ disjoint from $E$ called \textit{links} and a cost vector $c\in \mathbb{R}_{\geq 0}^L$, WTAP asks to find a minimum-cost set $F\subseteq L$ with the property that $(V,E\cup F)$ is $2$-edge connected. The special case where $c_\ell = 1$ for all $\ell\in L$ is called the Tree Augmentation Problem (TAP). For the class of bounded cost vectors, we present a first improved approximation algorithm for WTAP since more than three decades. Concretely, for any $M\in \mathbb{R}_{\geq 1}$ and $\epsilon > 0$, we present an LP based $(\delta+\epsilon)$-approximation for WTAP restricted to cost vectors $c$ in $[1,M]^L$ for $\delta \approx 1.96417$. More generally, our result is a $(\delta+\epsilon)$-approximation algorithm with running time $n^{r^{O(1)}}$, where $r = c_{\max}/c_{\min}$ is the ratio between the largest and the smallest cost of any link. For the special case of TAP we improve this factor to $\frac{5}{3}+\epsilon$. Our results rely on several new ideas, including a new LP relaxation of WTAP and a two-phase rounding algorithm.

A spanner H of a weighted undirected graph G is a ``sparse" subgraph that approximately preserves distances between every pair of vertices in G. We refer to H as a \delta-spanner of G for some parameter \delta\geq 1 if the distance in H between every vertex pair is at most a factor \delta bigger than in G. In this case, we say that H has stretch \delta. Two main measures of the sparseness of a spanner are the size (number of edges) and the total weight (the sum of weights of the edges in the spanner). Recently Elkin, Neiman and Solomon [ICALP 14] presented an improved analysis of the greedy algorithm, proving that the greedy algorithm admits (2k-1) \cdot (1+\eps) stretch and total edge weight of O_{\eps}((k/\log{k}) \cdot \omega(MST(G))\cdot n^{1/k}), where \omega(MST(G)) is the weight of a minimum spanning tree of G. The previous analysis by Chandra \etal [SOCG 92] admitted (2k-1)\cdot (1+\eps) stretch and total edge weight of O_{\eps}(k \omega(MST(G)) n^{1/k}). Hence, Elkin \etal improved the weight of the spanner by a \log{k} factor. In this work, we completely remove the k factor from the weight, presenting a spanner with (2k-1)\cdot (1+\eps) stretch, O_{\eps}(\omega(MST(G)) n^{1/k}) total weight, and O(n^{1+1/k}) edges. Up to a $(1+\eps)$ factor in the stretch this matches the girth conjecture of Erd\H{o}s \cite{Er64}.