#### Approximation Guarantees for the Minimum Linear Arrangement Problem by Higher Eigenvalues

Suguru Tamaki (Kyoto University); Yuichi Yoshida (National Institute of Informatics, and Preferred Infrastructure, Inc.)The problem of finding large cliques in random graphs and its ``planted" variant, where one wants to recover a clique of size $\omega \gg \log{(n)}$ added to an \Erdos-\Renyi graph $G \sim G(n,\frac{1}{2})$, have been intensely studied. Nevertheless, existing polynomial time algorithms can only recover planted cliques of size $\omega = \Omega(\sqrt{n})$. By contrast, information theoretically, one can recover planted cliques so long as $\omega \gg \log{(n)}$. In this work, we continue the investigation of algorithms from the sum of squares hierarchy for solving the planted clique problem begun by Meka, Potechin, and Wigderson \cite{MPW15} and Deshpande and Montanari \cite{DM15}. Our main result is that degree four SoS does not recover the planted clique unless $\omega \gg \sqrt n / \polylog n$, improving upon the bound $\omega \gg n^{1/3}$ due to \cite{DM15}. An argument of Kelner shows that the this result cannot be proved using the same certificate as prior works. Rather, our proof involves constructing and analyzing a new certificate that yields the nearly tight lower bound by ``correcting" the certificate of \cite{MPW15,DM15,FeigeK03}.

The Firefighter problem and a variant of it, known as Resource Minimization for Fire Containment (RMFC), are natural models for optimal inhibition of harmful spreading processes. Despite considerable progress on several fronts, the approximability of these problems is still badly understood. This is the case even when the underlying graph is a tree, which is one of the most-studied graph structures in this context and the focus of this paper. In their simplest version, a fire spreads from one fixed vertex step by step from burning to adjacent non-burning vertices, and at each time step B many non-burning vertices can be protected from catching fire. The Firefighter problem asks, for a given B, to maximize the number of vertices that will not catch fire, whereas RMFC (on a tree) asks to find the smallest B that allows for saving all leaves of the tree. Prior to this work, the best known approximation ratios were an O(1)-approximation for the Firefighter problem and an O(log^* n)-approximation for RMFC, both being LP-based and essentially matching the integrality gaps of two natural LP relaxations. We improve on both approximations by presenting a PTAS for the Firefighter problem and an O(1)-approximation for RMFC, both qualitatively matching the known hardness results. Our results are obtained through a combination of the known LPs with several new techniques, which allow for efficiently enumerating over super-constant size sets of constraints to strengthen the natural LPs.

There are two errors in our paper ``Approximating minimum-cost connectivity problems via uncrossable bifamilies'' (ACM Transactions on Algorithms (TALG), 9(1), Article No. 1, 2012). In that paper we consider the (undirected) {\sc Survivable Network} problem. The input consists of a graph $G=(V,E)$ with edge-costs, a set $T \subs V$ of terminals, and connectivity demands $\{r_{st}>0:st \in D \subs T \times T\}$. The goal is to find a minimum cost subgraph $H$ of $G$ that for all $st \in D$ contains $r_{st}$ pairwise internally-disjoint $st$-paths. We claimed ratios $O(k \ln k)$ for rooted demands when the set $D$ of demand pairs forms a star, where $k=\max_{st \in D} r_{st}$ is the maximum demand. This ratio is correct when the requirements are $r_{st}=k$ for all $t \in T \sem \{s\}$, but for general rooted demands our paper implies only ratio $O(k^2)$ (which however is still the currently best known ratio for the problem). We also obtained various ratios for the node-weighted version of the problem. These results are valid, but the proof needs a correction described here.

The SINR model attempts to predict whether a particular transmitter is heard at a specific location, in a setting consisting of n simultaneous transmitters and background noise. The SINR model gives rise to the SINR diagram, which partitions the space into n regions, one per transmitter, and the remaining space where no transmitter can be heard. Point location in the SINR diagram, i.e., determining which transmitter is heard at a query point (if any), has been investigated in several papers. These planar data structures are constructed in time at least quadratic in n and support logarithmic-time approximate queries. Moreover, the performance of some of them depends also on some geometric parameters that cannot be bounded as a function of n or epsilon. In this paper, we address the question of batched point-location queries, i.e., answering many queries simultaneously. In one dimension, we can answer n queries exactly in amortized polylogarithmic time per query, while in the plane we can do it approximately. These results can handle arbitrary power assignments to the transmitters. Moreover, the amortized query time depends only on n and epsilon. We also show how to speed up the preprocessing in a previously proposed point-location structure for uniform-power sites, by almost a full order of magnitude. For this we obtain results on the sensitivity of the reception regions to slight changes in the reception threshold, which are of independent interest. Finally, these results demonstrate the power of combining algebraic tools with those of computational geometry and other fields.

This paper shows the weighted matching problem on general graphs
can be solved in time $O(n(m + n\log n))$ for $n$ and $m$ the
number of vertices and edges, respectively. This was previously known
only for bipartite graphs. The crux is a data structure for blossom creation. It uses a dynamic nearest-common-ancestor algorithm
to simplify blossom steps, so they involve only back edges rather than arbitrary nontree edges.
The rest of the paper presents direct extensions of Edmonds'
blossom algorithm to weighted $b$-matching and $f$-factors. Again
the time bound is the one previously known for
bipartite graphs: for $b$-matching the time is $O(\min\{b(V),n\log
n\}(m + n\log n))$ and for $f$-factors the time is $O(\min {f(V),m\log n\}( m + n\log n) )$,
where $b(V)$ and $f(V)$ denote the sum of all degree constraints.
Several immediate applications of the $f$-factor algorithm
are given: The generalized shortest path structure of \cite{GS13}, i.e., the analog of the shortest path tree for
conservative undirected graphs, is shown to be a version of
the blossom structure for $f$-factors. This structure is found in time $O(|N|(m+n\log n))$ for $N$ the set of negative edges ($0<|N|

We provide evidence that computing the maximum flow value between every pair of nodes in a directed graph on $n$ nodes, $m$ edges, and capacities in the range $[1..n]$, which we call the All-Pairs Max-Flow problem, cannot be solved in time that is faster significantly than $O(n^2 m)$. Since a single maximum $st$-flow in such graphs can be solved in time $\tO(m\sqrt{n})$ [Lee and Sidford, FOCS 2014], we conclude that the all-pairs version might require time equivalent to $\tilde\Omega(n^{3/2})$ computations of maximum $st$-flow, which strongly separates the directed case from the undirected one. Moreover, if maximum $st$-flow can be solved in time $\tO(m)$, then the runtime of $\tilde\Omega(n^2)$ computations is needed. This is in contrast to a conjecture of Lacki, Nussbaum, Sankowski, and Wulf-Nilsen [FOCS 2012] that All-Pairs Max-Flow in general graphs can be solved faster than the time of $O(n^2)$ computations of maximum $st$-flow. Specifically, we show that in sparse graphs $G=(V,E,w)$, if one can compute All-Pairs Max-Flow in time $O((n^2 m)^{1-\varepsilon})$, for some constant $\varepsilon>0$, then MAX-CNF-SAT with $n'$ variables and $m'$ clauses can be solved in time ${m'}^{O(1)}2^{(1-\delta)n'}$ for a constant $\delta(\varepsilon)>0$, a problem for which not even $2^{n'}/\poly(n')$ algorithms are known. Such runtime for MAX-CNF-SAT would in particular refute the Strong Exponential Time Hypothesis (SETH). Hence, we improve the lower bound of Abboud, Vassilevska-Williams, and Yu [STOC 2015], who showed that for every fixed $\varepsilon>0$ and $\card{S}=\card{T}=O(\sqrt{n})$, if the above problem can be solved in time $O(n^{3/2-\varepsilon})$, then some incomparable (and intuitively weaker) conjecture is false.

We investigate the complexity of several fundamental polynomial-time solvable problems on graphs and matrices of low treewidth. Our goal is to construct algorithms with running time of the form poly(k) * n, where k is the width of a tree decomposition given on input. Such procedures would outperform the best known algorithms already for moderate values of the treewidth, like O(n^{1/c}) for some small constant c. Our results include: - an algorithm for computing the determinant and the rank of an n x n matrix using O(k^3 * n) time and arithmetic operations; - an algorithm for solving a system of linear equations using O(k^3 * n) time and arithmetic operations; - an O(k^3 * n log n)-time randomized algorithm for finding the cardinality of a maximum matching in a graph; - an O(k^4 * n log^2 n)-time randomized algorithm for constructing a maximum matching in a graph; - an O(k^2 * n log n)-time algorithm for finding a maximum vertex flow in a directed graph. Moreover, we give an approximation algorithm for treewidth with suitable time complexity: given a graph G and integer k, in time O(k^7 * n log n) it concludes that G has treewidth larger than k, or constructs a decomposition of width O(k^2). The results stand in contrast with the recent work of Abboud et al. [SODA 2016], which shows that the algorithms with similar running times are unlikely for the problems of finding the diameter and radius of a graph of low treewidth.

We show how to compute for n-vertex planar graphs in O(n^{11/6} polylog(n)) expected time the diameter and the sum of the pairwise distances. The algorithms work for directed graphs with real weights and no negative cycles. In O(n^{15/8} polylog(n)) expected time we can also compute the number of pairs of vertices at distance smaller than a given threshold, These are the first algorithms for these problems using time O(n^c) for some constant c < 2, even when restricted to undirected, unweighted planar graphs.

Randomization is a fundamental tool used in many theoretical and practical areas of computer science. We study here the role of randomization in the area of submodular function maximization. In this area most algorithms are randomized, and in almost all cases the approximation ratios obtained by current randomized algorithms are superior to the best results obtained by known deterministic algorithms. Derandomization of algorithms for general submodular function maximization seems hard since the access to the function is done via a value oracle. This makes it hard, for example, to apply standard derandomization techniques such as conditional expectations. Therefore, an interesting fundamental problem in this area is whether randomization is inherently necessary for obtaining good approximation ratios. In this work we give evidence that randomization is not necessary for obtaining good algorithms by presenting a new technique for derandomization of algorithms for submodular function maximization. Our high level idea is to maintain explicitly a (small) distribution over the states of the algorithm, and carefully update it using marginal values obtained from an extreme point solution of a suitable linear formulation. We demonstrate our technique on two recent algorithms for unconstrained submodular maximization and for maximizing submodular function subject to a cardinality constraint. In particular, for unconstrained submodular maximization we obtain an optimal deterministic $1/2$-approximation showing that randomization is unnecessary for obtaining optimal results for this setting.

Let V be a set of n points in R^d, which we call voters, where d is a fixed constant. A point p in R^d is preferred over another point p' in R^d by a voter v in V if dist(v,p) < dist(v,p'). A point p is called a plurality point if it is preferred by at least as many voters as any other point p'. We present an algorithm that decides in O(n log n) time whether V admits a plurality point in the L_2 norm and, if so, finds the (unique) plurality point. We also give efficient algorithms to compute the smallest subset W of V such that V - W admits a plurality point, and to compute a so-called minimum-radius plurality ball. Finally, we consider the problem in the personalized L_1 norm, where each point v in V has a preference vector

We study the problem of detecting {\em outlier pairs} of strongly correlated variables among a collection of $n$ variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of $n$ vectors with unit Euclidean norm and dimension $d$, and we are asked to find all the outlier pairs of vectors whose inner product is at least $\rho$ in absolute value, subject to the promise that all but at most $q$ pairs of vectors have inner product at most $\tau$ in absolute value for some constants $0<\tau<\rho<1$. Improving on an algorithm of G.~Valiant [FOCS~2012; J.\,ACM~2015], we present a randomized algorithm that for Boolean inputs ($\{-1,1\}$-valued data normalized to unit Euclidean length) runs in time \[ \tilde O\bigl(n^{\max\,\{1-\gamma+M(\Delta\gamma,\gamma),\,M(1-\gamma,2\Delta\gamma)\}}+qdn^{2\gamma}\bigr)\,, \] where $0<\gamma<1$ is a constant tradeoff parameter and $M(\mu,\nu)$ is the exponent to multiply an $\lfloor n^\mu\rfloor\times\lfloor n^\nu\rfloor$ matrix with an $\lfloor n^\nu\rfloor\times \lfloor n^\mu\rfloor$ matrix and $\Delta=1/(1-\log_\tau\rho)$. As corollaries we obtain randomized algorithms that run in time \[ \tilde O\bigl(n^{\frac{2\omega}{3-\log_\tau\rho}}+qdn^{\frac{2(1-\log_\tau\rho)}{3-\log_\tau\rho}}\bigr) \] and in time \[ \tilde O\bigl(n^{\frac{4}{2+\alpha(1-\log_\tau\rho)}}+qdn^{\frac{2\alpha(1-\log_\tau\rho)}{2+\alpha(1-\log_\tau\rho)}}\bigr)\,, \] where $2\leq\omega<2.38$ is the exponent for square matrix multiplication and $0.3<\alpha\leq 1$ is the exponent for\, rectangular matrix multiplication. We present further corollaries for the light bulb problem and for learning sparse Boolean functions. (The notation {$\tilde O(\cdot)$} hides polylogarithmic factors in $n$ and $d$ whose degree may depend on $\rho$ and $\tau$.)

A filtration over a simplicial complex K is an ordering of the simplices of K such that all prefixes in the ordering are subcomplexes of K. Filtrations are at the core of Persistent Homology, a major tool in Topological Data Analysis. In order to represent the filtration of a simplicial complex, the entire filtration can be appended to any data structure that explicitly stores all the simplices of the complex such as the Hasse diagram or the recently introduced Simplex Tree [Algorithmica '14]. However, with the popularity of various computational methods that need to handle simplicial complexes, and with the rapidly increasing size of the complexes, the task of finding a compact data structure that can still support efficient queries is of great interest. In this paper, we propose a new data structure called the Critical Simplex Diagram (CSD) which is a variant of the Simplex Array List (SAL) [SoCG '15]. Our data structure allows to store in a compact way the filtration of a simplicial complex, and allows for the efficient implementation of a large range of basic operations. Moreover, we prove that our data structure is essentially optimal with respect to the requisite storage space. Finally, we show that the CSD representation admits fast construction algorithms for Flag complexes and relaxed Delaunay complexes.

Properties definable in first-order logic are algorithmically interesting for both theoretical and pragmatic reasons. Many of the most studied algorithmic problems, such as Hitting Set and Orthogonal Vectors, are first-order, and the first-order properties naturally arise as relational database queries. A relatively straightforward algorithm for evaluating a property with k+1 quantifiers takes time $O(m^k)$ and, assuming the Strong Exponential Time Hypothesis (SETH), some such properties require $O(m^{k-\epsilon})$ time for any $\epsilon > 0$. (Here, m represents the size of the input structure, i.e. the number of tuples in all relations.) We give algorithms for every first-order property that improves this upper bound to $m^k/2^{\Theta(\sqrt{\log n})}$, i.e., an improvement by a factor more than any poly-log, but less than the polynomial required to refute SETH. Moreover, we show that further improvement is equivalent to improving algorithms for sparse instances of the well-studied Orthogonal Vectors problem. Surprisingly, both results are obtained by showing completeness of the Sparse Orthogonal Vectors problem for the class of first-order properties under fine-grained reductions. To obtain improved algorithms, we apply the fast Orthogonal Vectors algorithm of [AWY15,CW16]. While fine-grained reductions (reductions that closely preserve the conjectured complexities of problems) have been used to relate the hardness of disparate specific problems both within P and beyond, this is the first such completeness result for a standard complexity class.

A simple digraph is semi-complete if for any two of its vertices u and v, at least one of the arcs (u,v) and (v,u) is present. We study the complexity of computing two layout parameters of semi-complete digraphs: cutwidth and optimal linear arrangement (Ola). We prove that: " Both parameters are NP-hard to compute and the known exact and parameterized algorithms for them have essentially optimal running times, assuming the Exponential Time Hypothesis. " The cutwidth parameter admits a quadratic Turing kernel, whereas it does not admit any polyno- mial kernel unless NP coNP/poly. By contrast, Ola admits a linear kernel. These results essentially complete the complexity analysis of computing cutwidth and Ola on semi- complete digraphs. Our techniques can be also used to analyze the sizes of minimal obstructions for having small cutwidth under the induced subdigraph relation.

We study the problem of $2$-dimensional orthogonal range counting with additive error. Given a set $P$ of $n$ points drawn from an $n\times n$ grid and an error parameter $\eps$, the goal is to build a data structure, such that for any orthogonal range $R$, it can return the number of points in $P\cap R$ with additive error $\eps n$. A well-known solution for this problem is the {\em $\eps$-approximation}, which is a subset $A\subseteq P$ that can estimate the number of points in $P\cap R$ with the number of points in $A\cap R$. It is known that an $\eps$-approximation of size $O(\frac{1}{\eps} \log^{2.5} \frac{1}{\eps})$ exists for any $P$ with respect to orthogonal ranges, and the best lower bound is $\Omega(\frac{1}{\eps} \log \frac{1}{\eps})$. The $\eps$-approximation is a rather restricted data structure, as we are not allowed to store any information other than the coordinates of the points in $P$. In this paper, we explore what can be achieved without any restriction on the data structure. We first describe a simple data structure that uses $O(\frac{1}{\eps}(\log^2\frac{1} {\eps} + \log n) )$ bits and answers queries with error $\eps n$. We then prove a lower bound that any data structure that answers queries with error $\eps n$ must use $\Omega(\frac{1}{\eps}(\log^2\frac{1} {\eps} + \log n) )$ bits. Our lower bound is information-theoretic: We show that there is a collection of $2^{\Omega(n\log n)}$ point sets with large {\em union combinatorial discrepancy}, and thus are hard to distinguish unless we use $\Omega(n\log n)$ bits.

In the Interval Completion problem we are given an n-vertex graph G and an integer k, and the task is to transform G by making use of at most k edge additions into an interval graph. This is a fundamental graph modifi cation problem with applications in sparse matrix multiplication and molecular biology. The question about fi xed-parameter tractability of Interval Completion was asked by Kaplan, Shamir and Tarjan [FOCS 1994; SIAM J. Comput. 1999] and was answered a rmatively more than a decade later by Villanger at el. [STOC 2007; SIAM J. Comput. 2009], who presented an algorithm with running time O(k^{2k} n^3 m). We give the first subexponential parameterized algorithm solving Interval Completion in time k^O(sqrt(k)) n^O(1). This adds Interval Completion to a very small list of parameterized graph modi fication problems solvable in subexponential time.

How efficiently can we find an unknown graph using distance or shortest path queries between its vertices? We assume that the unknown graph G is connected, unweighted, and has bounded degree. In the reconstruction problem, the goal is to find the graph G. In the verification problem, we are given a hypothetical graph $\hat G$ and want to check whether G is equal to $\hat G$. We provide a randomized algorithm for reconstruction using $\tilde O(n^{3/2})$ distance queries, based on Voronoi cell decomposition. Next, we analyze natural greedy algorithms for reconstruction using a shortest path oracle and also for verification using either oracle, and show that their query complexity is $n^{1+o(1)}$. We further improve the query complexity when the graph is chordal or outerplanar. Finally, we show some lower bounds, and consider an approximate version of the reconstruction problem.

The Weighted Tree Augmentation Problem (WTAP) is a fundamental well-studied problem in the field of network design. Given an undirected tree $G=(V,E)$, an additional set of edges $L \subseteq V\times V$ disjoint from $E$ called \textit{links} and a cost vector $c\in \mathbb{R}_{\geq 0}^L$, WTAP asks to find a minimum-cost set $F\subseteq L$ with the property that $(V,E\cup F)$ is $2$-edge connected. The special case where $c_\ell = 1$ for all $\ell\in L$ is called the Tree Augmentation Problem (TAP). For the class of bounded cost vectors, we present a first improved approximation algorithm for WTAP since more than three decades. Concretely, for any $M\in \mathbb{R}_{\geq 1}$ and $\epsilon > 0$, we present an LP based $(\delta+\epsilon)$-approximation for WTAP restricted to cost vectors $c$ in $[1,M]^L$ for $\delta \approx 1.96417$. More generally, our result is a $(\delta+\epsilon)$-approximation algorithm with running time $n^{r^{O(1)}}$, where $r = c_{\max}/c_{\min}$ is the ratio between the largest and the smallest cost of any link. For the special case of TAP we improve this factor to $\frac{5}{3}+\epsilon$. Our results rely on several new ideas, including a new LP relaxation of WTAP and a two-phase rounding algorithm.

We introduce the dependent doors problem as an abstraction for situations in which one must perform a sequence of dependent decisions, without receiving feedback information on the effectiveness of previously made actions. Informally, the problem considers a set of d doors that are initially closed. To open a door, the algorithm knocks on it and it might open or not according to some probability distribution. This distribution may depend on which other doors are currently open, as well as on which other doors were open during each of the previous knocks on that door. The algorithm aims to minimize the expected time until all doors open without knowing whether or which other doors have already opened. Here, we focus on scenarios where dependencies are positively correlated and acyclic. The fundamental distribution of a door describes the probability it opens in the best of conditions. We show that if in two configurations corresponding doors share the same fundamental distribution, then these configurations have the same optimal running time up to a universal constant, no matter what are the dependencies between doors and what are the distributions. We also identify algorithms that are optimal up to a universal constant factor. We then turn our attention to investigate precise bounds. Even for the case of two doors, identifying the optimal sequence is an intriguing combinatorial question. Here, we study the case of two cascading memoryless doors and solve it almost completely.

A spanner H of a weighted undirected graph G is a ``sparse" subgraph that approximately preserves distances between every pair of vertices in G. We refer to H as a \delta-spanner of G for some parameter \delta\geq 1 if the distance in H between every vertex pair is at most a factor \delta bigger than in G. In this case, we say that H has stretch \delta. Two main measures of the sparseness of a spanner are the size (number of edges) and the total weight (the sum of weights of the edges in the spanner). Recently Elkin, Neiman and Solomon [ICALP 14] presented an improved analysis of the greedy algorithm, proving that the greedy algorithm admits (2k-1) \cdot (1+\eps) stretch and total edge weight of O_{\eps}((k/\log{k}) \cdot \omega(MST(G))\cdot n^{1/k}), where \omega(MST(G)) is the weight of a minimum spanning tree of G. The previous analysis by Chandra \etal [SOCG 92] admitted (2k-1)\cdot (1+\eps) stretch and total edge weight of O_{\eps}(k \omega(MST(G)) n^{1/k}). Hence, Elkin \etal improved the weight of the spanner by a \log{k} factor. In this work, we completely remove the k factor from the weight, presenting a spanner with (2k-1)\cdot (1+\eps) stretch, O_{\eps}(\omega(MST(G)) n^{1/k}) total weight, and O(n^{1+1/k}) edges. Up to a $(1+\eps)$ factor in the stretch this matches the girth conjecture of Erd\H{o}s \cite{Er64}.