ACM Transactions on Algorithms (TALG)

Latest Articles

Graph Reconstruction and Verification

How efficiently can we find an unknown graph using distance or shortest path queries between its vertices? We assume that the unknown graph G is connected, unweighted, and has bounded degree. In the reconstruction problem, the goal is to find the graph G. In the verification problem, we are given a hypothetical graph Ĝ and want to check...

Batched Point Location in SINR Diagrams via Algebraic Tools

The SINR (Signal to Interference plus Noise Ratio) model for the quality of wireless connections has been the subject of extensive recent study. It...

Conditional Lower Bounds for All-Pairs Max-Flow

We provide evidence that computing the maximum flow value between every pair of nodes in a directed graph on n nodes, m edges, and capacities in the range [1‥n], which we call the All-Pairs Max-Flow problem, cannot be solved in time that is significantly faster (i.e., by a polynomial factor) than O(n3) even for sparse graphs, namely...

The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback

We introduce the dependent doors problem as an abstraction for situations in which one must perform a sequence of dependent decisions, without receiving feedback information on the effectiveness of previously made actions. Informally, the problem considers a set of d doors that are initially closed, and the aim is to open all of them as fast as...

An Efficient Representation for Filtrations of Simplicial Complexes

A filtration over a simplicial complex K is an ordering of the simplices of K such that all prefixes in the ordering are subcomplexes of K....

Approximation Guarantees for the Minimum Linear Arrangement Problem by Higher Eigenvalues

Given an n-vertex undirected graph G = (V,E) and positive edge weights {we}e∈E, a linear arrangement is a permutation π : V...

An Efficient Algorithm for Computing High-Quality Paths amid Polygonal Obstacles

We study a path-planning problem amid a set O of obstacles in R2, in which we wish to compute a short path between two points while also maintaining a...

Deterministic Parallel Algorithms for Fooling Polylogarithmic Juntas and the Lovász Local Lemma

Many randomized algorithms can be derandomized efficiently using either the method of conditional...

Streaming Algorithms for Estimating the Matching Size in Planar Graphs and Beyond

We consider the problem of estimating the size of a maximum matching when the edges are revealed in a streaming fashion. When the input graph is...

Adaptive Computation of the Swap-Insert Correction Distance

The Swap-Insert Correction distance from a string S of length n to another string L of length m ≥ n on the alphabet [1‥σ] is...

Dynamic Time Warping and Geometric Edit Distance: Breaking the Quadratic Barrier

Dynamic Time Warping (DTW) and Geometric Edit Distance (GED) are basic similarity measures between curves or general temporal sequences (e.g., time series) that are represented as sequences of points in some metric space (X, dist). The DTW and GED measures are massively used in various fields of computer science and computational biology....
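The quadratic barrier in the title refers to the textbook O(nm) dynamic program for DTW. As a point of reference, a minimal sketch of that classical algorithm (with an absolute-difference metric standing in for a general dist):

```python
import math

def dtw(a, b, dist=lambda p, q: abs(p - q)):
    """Classic O(n*m) dynamic program for Dynamic Time Warping.

    dp[i][j] is the cheapest warping cost of a[:i] onto b[:j];
    each step matches the current pair of points and advances in
    a, in b, or in both.
    """
    n, m = len(a), len(b)
    dp = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = dist(a[i - 1], b[j - 1]) + min(
                dp[i - 1][j],      # advance in a only
                dp[i][j - 1],      # advance in b only
                dp[i - 1][j - 1],  # advance in both
            )
    return dp[n][m]

# The repeated point in b is absorbed at no extra cost:
print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```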

Packing Groups of Items into Multiple Knapsacks

We consider a natural generalization of the classical multiple knapsack problem in which instead of packing single items we are packing groups of items. In this problem, we have multiple knapsacks and a set of items partitioned into groups. Each item has an individual weight, while the profit is associated with groups rather than items. The profit...


In Memoriam: David S. Johnson

About TALG

The ACM Transactions on Algorithms (TALG) publishes original research of the highest quality dealing with algorithms that are inherently discrete and finite, and having mathematical content in a natural way, either in the objective or in the analysis.

The Simplex Algorithm is NP-mighty

We show that the Simplex Method, the Network Simplex Method (both with Dantzig's original pivot rule), and the Successive Shortest Path Algorithm are NP-mighty. That is, each of these algorithms can be used to solve, with polynomial overhead, any problem in NP implicitly during the algorithm's execution. This result casts a more favorable light on these algorithms' exponential worst-case running times. Furthermore, as a consequence of our approach, we obtain several novel hardness results. For example, for a given input to the Simplex Algorithm, deciding whether a given variable ever enters the basis during the algorithm's execution and determining the number of iterations needed are both NP-hard problems. Finally, we close a long-standing open problem in the area of network flows over time by showing that earliest arrival flows are NP-hard to obtain.

Subquadratic Algorithms for the Diameter and the Sum of Pairwise Distances in Planar Graphs

We show how to compute for n-vertex planar graphs in O(n^{11/6} polylog(n)) expected time the diameter and the sum of the pairwise distances. The algorithms work for directed graphs with real weights and no negative cycles. In O(n^{15/8} polylog(n)) expected time we can also compute the number of pairs of vertices at distance smaller than a given threshold. These are the first algorithms for these problems using time O(n^c) for some constant c < 2, even when restricted to undirected, unweighted planar graphs.

A (2+ε)-Approximation for Maximum Weight Matching in the Semi-Streaming Model

We present a simple deterministic single-pass (2+ε)-approximation algorithm for the maximum weight matching problem in the semi-streaming model. This improves upon the currently best known approximation ratio of (3.5+ε). Our algorithm uses O(n log² n) space for constant values of ε. It relies on a variation of the local-ratio theorem, which may be of use for other algorithms in the semi-streaming model as well.
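For intuition, here is a simplified single-pass sketch in the local-ratio spirit (vertex potentials, a stack of edges with positive residual weight, greedy unwinding). This illustrates the mechanism only; it is not the paper's exact space-bounded algorithm, and the eps threshold is an assumption of this sketch:

```python
def local_ratio_matching(edge_stream, eps=0.1):
    """Single-pass matching sketch in the local-ratio spirit.

    Each arriving edge (u, v, w) is kept only if its weight beats
    the current vertex potentials; kept edges raise both endpoint
    potentials by the residual gain.  Greedily unwinding the stack
    (latest edges first) yields the matching.
    """
    phi = {}    # vertex potentials
    stack = []  # kept edges, in arrival order
    for u, v, w in edge_stream:
        gain = w - phi.get(u, 0.0) - phi.get(v, 0.0)
        if gain <= eps * w:  # drop edges with negligible residual weight
            continue
        stack.append((u, v))
        phi[u] = phi.get(u, 0.0) + gain
        phi[v] = phi.get(v, 0.0) + gain
    matched, matching = set(), []
    for u, v in reversed(stack):  # later edges take priority
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching

# The heavy middle edge displaces its light neighbors:
print(local_ratio_matching([(1, 2, 1), (2, 3, 10), (3, 4, 1)]))  # [(2, 3)]
```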

An O(log k)-competitive algorithm for Generalized Caching

In the generalized caching problem, we have a set of pages and a cache of size k. Each page p has a size w_p ≥ 1 and a fetching cost c_p for loading the page into the cache. At any point in time, the sum of the sizes of the pages stored in the cache cannot exceed k. The input consists of a sequence of page requests. If a page is not present in the cache at the time it is requested, it has to be loaded into the cache, incurring a cost of c_p. We give a randomized O(log k)-competitive online algorithm for the generalized caching problem, improving the previous bound of O(log² k) by Bansal, Buchbinder, and Naor (STOC'08). This improved bound is tight and of the same order as the known bounds for the classic problem with uniform weights and sizes. We use the same LP-based techniques as Bansal et al. but provide improved and slightly simplified methods for rounding fractional solutions online.
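To make the cost model concrete, a small simulator that charges c_p on every miss and respects the capacity constraint; the LRU-style eviction here is just a placeholder heuristic, not the paper's randomized LP-based algorithm, and pages are assumed to fit individually (size[p] ≤ k):

```python
from collections import OrderedDict

def serve_requests(requests, size, cost, k):
    """Total fetching cost under the generalized caching model,
    using LRU-order eviction as a stand-in policy.

    size[p] >= 1 and cost[p] are per-page; cached sizes may never
    exceed the capacity k.
    """
    cache = OrderedDict()  # page -> size, least recently used first
    used, total = 0, 0
    for p in requests:
        if p in cache:
            cache.move_to_end(p)  # hit: no cost
            continue
        total += cost[p]          # miss: pay the fetching cost c_p
        while cache and used + size[p] > k:
            _, s = cache.popitem(last=False)  # evict LRU page
            used -= s
        cache[p] = size[p]
        used += size[p]
    return total

# Capacity 1 forces a miss on every alternation: 5 + 3 + 5
print(serve_requests(['a', 'b', 'a'], {'a': 1, 'b': 1}, {'a': 5, 'b': 3}, 1))  # 13
```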

Distribution-free Junta Testing

We study the problem of testing whether an unknown n-variable Boolean function is a k-junta in the distribution-free property testing model, where the distance between functions is measured with respect to an arbitrary and unknown probability distribution over {0,1}^n. Our first main result is that distribution-free k-junta testing can be performed, with one-sided error, by an adaptive algorithm that uses Õ(k²)/ε queries (independent of n). Complementing this, our second main result is a lower bound showing that any non-adaptive distribution-free k-junta testing algorithm must make Ω(2^{k/3}) queries even to test to accuracy ε = 1/3. These bounds establish that while the optimal query complexity of non-adaptive k-junta testing is 2^{Θ̃(k)}, for adaptive testing it is poly(k), and thus show that adaptivity provides an exponential improvement in the distribution-free query complexity of testing juntas.

Clique-width III: Hamiltonian Cycle and the Odd Case of Graph Coloring

Max-Cut, Edge Dominating Set, Graph Coloring, and Hamiltonian Cycle on graphs of bounded clique-width have received significant attention. Each of these problems can be solved in time g(k)n^{f(k)} on graphs of clique-width k. Fomin et al. showed that the running times cannot be improved to g(k)n^{O(1)} assuming W[1] ≠ FPT. However, this does not rule out non-trivial improvements to the exponent f(k) in the running times. In a follow-up paper, the same authors improved the running times for Edge Dominating Set and Max-Cut to n^{O(k)}, and proved that these problems cannot be solved in time g(k)n^{o(k)} unless ETH fails. Thus, prior to this work, Edge Dominating Set and Max-Cut were known to have tight n^{Θ(k)} algorithmic upper and lower bounds. In this paper we provide lower bounds for Hamiltonian Cycle and Graph Coloring. For Hamiltonian Cycle, our lower bound g(k)n^{o(k)} asymptotically matches the recent upper bound n^{O(k)} due to Bergougnoux, Kanté, and Kwon [WADS 2017]. As opposed to the asymptotically tight n^{Θ(k)} bounds for Edge Dominating Set, Max-Cut, and Hamiltonian Cycle, the Graph Coloring problem has an upper bound of n^{O(2^k)} and a lower bound of merely n^{o(k^{1/4})} (implicit from the W[1]-hardness proof). In this paper, we close the gap for Graph Coloring by proving a lower bound of n^{2^{o(k)}}. This shows that Graph Coloring behaves qualitatively differently from the other three problems. To the best of our knowledge, Graph Coloring is the first natural problem known to require exponential dependence on the parameter in the exponent of n.

Deterministic Graph Exploration with Advice

We consider the fundamental task of graph exploration. An $n$-node graph has unlabeled nodes, and all ports at any node of degree $d$ are arbitrarily numbered $0,\dots, d-1$. A mobile agent, initially situated at some starting node $v$, has to visit all nodes and stop. The time of the exploration is the number of edge traversals. We consider the problem of how much knowledge the agent has to have a priori, in order to explore the graph in a given time, using a deterministic algorithm. Following the paradigm of algorithms with advice, this a priori information (advice) is provided to the agent by an oracle, in the form of a binary string, whose length is called the size of advice. We consider two types of oracles. The instance oracle knows the entire instance of the exploration problem, i.e., the port-numbered map of the graph and the starting node of the agent in this map. The map oracle knows the port-numbered map of the graph but does not know the starting node of the agent. The central question of the paper is the following. What is the minimum size of advice that must be given to the agent by each of these oracles, so that the agent explores the graph in a given time?

The Complexity of Independent Set Reconfiguration on Bipartite Graphs

We settle the complexity of the Independent Set Reconfiguration problem on bipartite graphs under all three commonly studied reconfiguration models. We show that under the token jumping or token addition/removal model the problem is NP-complete. For the token sliding model, we show that the problem remains PSPACE-complete.

Entropy and Optimal Compression of Some General Plane Trees

We continue developing the information theory of structured data. In this paper, we take up $d$-ary trees ($d\ge 2$) and trees with unrestricted degree. We first compute the entropy which gives us the fundamental lower bound on compression of such trees. Then we present efficient compression algorithms based on arithmetic encoding that achieve the entropy within a constant number of bits. A naïve implementation of these algorithms has a prohibitive time complexity of $O(n^d)$ elementary operations, but our efficient algorithms run in $O(n^2)$ of these operations, where $n$ is the number of nodes. It turns out that extending source coding (i.e., compression) from sequences to advanced data structures such as general trees is mathematically quite challenging and leads to new recurrences that find ample applications in the information theory of general structures (e.g., to analyze the information content of general non-plane trees).

Polynomial Kernels and Wideness Properties of Nowhere Dense Graph Classes

Nowhere dense classes of graphs are very general classes of uniformly sparse graphs with several seemingly unrelated characterisations. From an algorithmic perspective, a characterisation of these classes in terms of uniform quasi-wideness, a concept originating in finite model theory, has proved to be particularly useful. Uniform quasi-wideness is used in many fpt-algorithms on nowhere dense classes. However, the existing constructions showing the equivalence of nowhere denseness and uniform quasi-wideness imply a non-elementary blow up in the parameter dependence of the fpt-algorithms, making them infeasible in practice. As a first main result of this paper, we use tools from logic, in particular from a sub-field of model theory known as stability theory, to establish polynomial bounds for the equivalence of nowhere denseness and uniform quasi-wideness. A powerful method in parameterized complexity theory is to compute a problem kernel in a pre-computation step, that is, to reduce the input instance in polynomial time to a sub-instance of size bounded in the parameter only (independently of the input graph size). Our new tools allow us to obtain for every fixed value of r a polynomial kernel for the distance-r dominating set problem on nowhere dense classes of graphs. This result is particularly interesting, as it implies that for every class C of graphs which is closed under subgraphs, the distance-r dominating set problem admits a kernel on C for every value of r if, and only if, it already admits a polynomial kernel for every value of r (assuming FPT ≠ W[2]).

Stream Sampling Framework and Application for Frequency Cap Statistics

Unaggregated data, in a streamed or distributed form, is prevalent and comes from diverse sources such as interactions of users with web services and IP traffic. Data elements have keys (cookies, users, queries), and elements with different keys interleave. Analytics on such data typically utilizes statistics expressed as a sum, over keys in a specified segment, of a function $f$ applied to the frequency (the total number of occurrences) of the key. In particular, Distinct is the number of active keys in the segment, Sum is the sum of their frequencies, and both are special cases of frequency cap statistics, which cap the frequency by a parameter $T$. The number of distinct active keys in the data can be very large, making exact computation of queries costly. Instead, we can estimate these statistics from a sample. An optimal sample for a given function $f$ would include a key with frequency $w$ with probability roughly proportional to $f(w)$. But while such a "gold-standard" sample can be easily computed over the aggregated data (the set of key-frequency pairs), exact aggregation itself is costly, requiring state proportional to the number of active keys. Ideally, we would like to compute a sample without exact aggregation. We present a sampling framework for unaggregated data that uses a single pass (for streams) or two passes (for distributed data), uses state equal to the desired sample size, and applies to all cap statistics.
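For reference, here is the exact computation over aggregated data that the sampling framework is designed to avoid (it needs state per active key); Distinct and Sum fall out as the cap values T = 1 and T = ∞:

```python
from collections import Counter

def cap_statistic(elements, T, segment=None):
    """Exact frequency-cap statistic over unaggregated elements:
    the sum of min(frequency, T) over keys, optionally restricted
    to a segment (a predicate on keys).

    T = 1 gives Distinct; T = infinity gives Sum.
    """
    freq = Counter(elements)  # exact aggregation: state per active key
    return sum(min(f, T) for key, f in freq.items()
               if segment is None or segment(key))

data = ['a', 'a', 'a', 'b']
print(cap_statistic(data, 1))             # 2  (Distinct)
print(cap_statistic(data, float('inf')))  # 4  (Sum)
print(cap_statistic(data, 2))             # 3  (capped at T = 2)
```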

Efficient Algorithms for Constructing Very Sparse Spanners and Emulators

Miller et al. [MPVX15] devised a distributed algorithm in the CONGEST model (they actually showed a PRAM algorithm; the distributed algorithm with these properties is implicit in [MPVX15]) that, given a parameter $k = 1,2,\ldots$, constructs an $O(k)$-spanner of an input unweighted $n$-vertex graph with $O(n^{1+1/k})$ expected edges in $O(k)$ rounds of communication. In this paper we improve the result of [MPVX15] by showing a $k$-round distributed algorithm in the same model that constructs a $(2k-1)$-spanner with $O(n^{1+1/k}/\epsilon)$ edges, with probability $1-\epsilon$, for any $\epsilon > 0$. Moreover, when $k = \omega(\log n)$, our algorithm produces (still in $k$ rounds) ultra-sparse spanners, i.e., spanners of size $n(1+o(1))$, with probability $1-o(1)$. To our knowledge, this is the first distributed algorithm in the CONGEST or PRAM models that constructs spanners or skeletons (i.e., connected spanning subgraphs) that are this sparse. Our algorithm can also be implemented in linear time in the standard centralized model, and for large $k$, it provides spanners that are sparser than those given by any known (near-)linear time algorithm. We also devise improved bounds (and algorithms realizing these bounds) for $(1+\epsilon,\beta)$-spanners and emulators. In particular, we show that for any unweighted $n$-vertex graph and any $\epsilon > 0$, there exists a $(1+\epsilon, ((\log\log n)/\epsilon)^{\log\log n})$-emulator with $O(n)$ edges. All previous constructions of $(1+\epsilon,\beta)$-spanners and emulators employ a superlinear number of edges, for all choices of parameters. Finally, we provide some applications of our results to approximate shortest-path computation in unweighted graphs.

An Optimal Algorithm for l_1-Heavy Hitters in Insertion Streams and Related Problems

We give the first optimal bounds for returning the $\ell_1$-heavy hitters in a data stream of insertions, together with their approximate frequencies, closing a long line of work on this problem. For a stream of $m$ items in $\{1, 2, \ldots, n\}$ and parameters $0 < \epsilon < \phi \leq 1$, let $f_i$ denote the frequency of item $i$, i.e., the number of times item $i$ occurs in the stream. With arbitrarily large constant probability, our algorithm returns all items $i$ for which $f_i \geq \phi m$, returns no items $j$ for which $f_j \leq (\phi -\epsilon)m$, and returns approximations $\tilde{f}_i$ with $|\tilde{f}_i - f_i| \leq \epsilon m$ for each item $i$ that it returns. Our algorithm uses $O(\epsilon^{-1} \log\phi^{-1} + \phi^{-1} \log n + \log \log m)$ bits of space, processes each stream update in $O(1)$ worst-case time, and can report its output in time linear in the output size. We also prove a lower bound, which implies that our algorithm is optimal up to a constant factor in its space complexity. A modification of our algorithm can be used to estimate the maximum frequency up to an additive $\epsilon m$ error in the above amount of space, resolving Question 3 in the IITK 2006 Workshop on Algorithms for Data Streams for the case of $\ell_1$-heavy hitters. We also introduce several variants of the heavy hitters problem, inspired by rank aggregation and voting schemes, and show how our techniques can be applied in such settings.
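As a point of comparison, the classic deterministic Misra-Gries summary is a standard baseline for $\ell_1$-heavy hitters (it is not the paper's optimal algorithm, which also achieves O(1) update time and tighter space):

```python
def misra_gries(stream, k):
    """Misra-Gries summary with k counters.

    One pass over a stream of m items; every item with true
    frequency f_i ends with an estimate satisfying
    f_i - m/(k+1) <= estimate <= f_i, so items with
    f_i > m/(k+1) are guaranteed to survive.
    """
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k:
            counters[x] = 1
        else:
            # Summary is full: decrement everything, dropping zeros.
            for y in list(counters):
                counters[y] -= 1
                if counters[y] == 0:
                    del counters[y]
    return counters

print(misra_gries(['a'] * 5 + ['b'] * 2 + ['c'], 2))  # {'a': 4, 'b': 1}
```

A second pass over the stream can turn these underestimates into exact counts for the surviving candidates.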

Even Delta-Matroids and the Complexity of Planar Boolean CSPs

The main result of this paper is a generalization of the classical blossom algorithm for finding perfect matchings. Our algorithm can efficiently solve Boolean CSPs where each variable appears in exactly two constraints (we call it edge CSP) and all constraints are even Δ-matroid relations (represented by lists of tuples). As a consequence of this, we settle the complexity classification of planar Boolean CSPs started by Dvořák and Kupec. Knowing that edge CSP is tractable for even Δ-matroid constraints allows us to extend the tractability result to a larger class of Δ-matroids that includes many classes that were known to be tractable before, namely co-independent, compact, local, and binary.
