ACM Transactions on

Algorithms (TALG)

Latest Articles

Distribution-free Junta Testing

We study the problem of testing whether an unknown n-variable Boolean function is a k-junta in the distribution-free property testing model, where the distance between functions is measured with respect to an arbitrary and unknown probability distribution over {0,1}n. Our first main result is that distribution-free k-junta testing can be... (more)

An Optimal Algorithm for ℓ1-Heavy Hitters in Insertion Streams and Related Problems

We give the first optimal bounds for returning the ℓ1-heavy hitters in a data stream of... (more)

Entropy and Optimal Compression of Some General Plane Trees

We continue developing the information theory of structured data. In this article, we study models generating d-ary trees (d ≥ 2) and trees with... (more)

Efficient Algorithms for Constructing Very Sparse Spanners and Emulators

Miller et al. [48] devised a distributed1 algorithm in the CONGEST model that, given a parameter k = 1,2,&ldots;, constructs an O(k)-spanner of... (more)

The Simplex Algorithm Is NP-Mighty

We show that the Simplex Method, the Network Simplex Method—both with Dantzig’s original pivot rule—and the Successive Shortest Path Algorithm are NP-mighty. That is, each of these algorithms can be used to solve, with polynomial overhead, any problem in NP implicitly during the algorithm’s execution. This... (more)

An O(log k)-Competitive Algorithm for Generalized Caching

In the generalized caching problem, we have a set of pages and a cache of size k. Each page p has a size wp≥ 1 and fetching cost cp for loading the... (more)

The Complexity of Independent Set Reconfiguration on Bipartite Graphs

We settle the complexity of the Independent Set Reconfiguration problem on bipartite graphs under all three commonly studied reconfiguration models.... (more)

Deterministic Graph Exploration with Advice

We consider the fundamental task of graph exploration. An n-node graph has unlabeled nodes, and all ports at any node of degree d are arbitrarily numbered 0,…, d−1. A mobile agent, initially situated at some starting node v, has to visit all nodes and stop. The time of the exploration is the number of edge traversals. We consider the... (more)

Clique-width III: Hamiltonian Cycle and the Odd Case of Graph Coloring

MAX-CUT, EDGE DOMINATING SET, GRAPH COLORING, and HAMILTONIAN CYCLE on graphs of bounded clique-width have received significant attention as they can be formulated in MSO2 (and, therefore, have linear-time algorithms on bounded treewidth graphs by the celebrated Courcelle’s theorem), but cannot be formulated in MSO1 (which would have... (more)

Improved Analysis of Deterministic Load-Balancing Schemes

We consider the problem of deterministic load balancing of tokens in the discrete model. A set of n processors is connected into a d-regular... (more)

Feedback Vertex Set Inspired Kernel for Chordal Vertex Deletion

Given a graph G and a parameter k, the Chordal Vertex Deletion (CVD) problem asks whether there exists a subset U⊆ V(G) of size at most k that... (more)

Common Tangents of Two Disjoint Polygons in Linear Time and Constant Workspace

We provide a remarkably simple algorithm to compute all (at most four) common tangents of two disjoint simple polygons. Given each polygon as a... (more)


In Memoriam: David S. Johnson

About TALG

The ACM Transactions on Algorithms (TALG) publishes original research of the highest quality dealing with algorithms that are inherently discrete and finite, and having mathematical content in a natural way, either in the objective or in the analysis.

read more
Locating Errors in Faulty Formulas

Given a drawing of a read-once formula (called the blueprint), and a blackbox implementation with the same topology as the blueprint, that purports to compute the formula, can we tell if it does? Under a fault model, where the only faults in the implementation are gates that complement their outputs, we show that there is an efficient algorithm that makes a linear number of probes to the blackbox implementation and determines if the blueprint and implementation are identical. We also show a matching lower bound. We further ask whether we can diagnose where the faults are, using black-box testing. We prove that if the implementation has a property called {\em polynomial balance} it is possible to do this efficiently. To complement this result, we show that even if the \textit{blueprint} is polynomially balanced, and there are only logarithmically many errors in the implementation, the implementation could be unbalanced and the diagnosis problem provably requires super-polynomially many tests. We point out that this problem is one instance of a general class of problems of learning deviations from a blueprint, which we call {\em conformance learning}. Conformance learning seems worthy of further investigation in a broader context.

Approximation Schemes for Clustering with Outliers

We study clustering problems with outliers. More specifically, we look at \ufl (UFL), \kmed, and \kmeans. In these problems, we are given a set $\cX$ of data points in a metric space $\delta(.,.)$, a set $\fa$ of possible centres (each maybe with an opening cost), maybe an integer parameter $k$, plus an additional parameter $z$ as the number of outliers. In \ufl with outliers, we have to open some centres, discard up to $z$ points of $\cX$ and assign every other point to the nearest open centre, minimizing the total assignment cost plus centre opening costs. In \kmed and \kmeans, we have to open up to $k$ centres but there are no opening costs. In \kmeans, the cost of assigning $j$ to $i$ is $\delta^2(j,i)$. We present several results. Our main focus is on cases where $\delta$ is a doubling metric or is the shortest path metrics of graphs from a minor-closed family of graphs. For \uufl with outliers on such metrics we show that a multiswap simple local search heuristic yields a PTAS. With a bit more work, we extend this to bicriteria approximations for the \kmed and \kmeans problems in the same metrics where, for any constant $\epsilon > 0$, we can find a solution using $(1+\epsilon)k$ centres whose cost is at most a $(1+\epsilon)$-factor of the optimum and uses at most $z$ outliers.

Approximation schemes for machine scheduling with resource (in-)dependent processing times

We consider two related scheduling problems: single resource constrained scheduling on identical parallel machines and a generalization with resource dependent processing times. In both problems, jobs require a certain amount of an additional resource and have to be scheduled on machines minimizing the makespan, while at every point in time a given resource capacity is not exceeded. In the first variant of the problem the processing times and resource amounts are fixed, while in the second the former depends on the latter. Both problems contain bin packing with cardinality constraint as a special case, and, therefore, these problems are strongly NP-complete even for a constant number of machines larger than three, which can be proven by a reduction from 3-Partition. Furthermore, if the number of machines is part of the input, we can not hope for an approximation algorithm with absolute approximation ratio smaller than 3/2. We present asymptotic fully polynomial time approximation schemes (AFPTAS) for the problems: For any µ > 0 a schedule of length at most (1+µ) times the optimum plus an additive term of O(log(1/µ)/µ) is provided, and the running time is polynomially bounded in 1/µ and the input length. Up to now only approximation algorithms with absolute approximation ratios were known. Furthermore, the AFPTAS for resource constrained scheduling on identical parallel machines directly improves the additive term of the best AFPTAS for bin packing with cardinality constraint so far.

A Lottery Model for Center-type Problems With Outliers

In this paper, we give tight approximation algorithms for the $k$-center and matroid center problems with outliers. Unfairness arises naturally in this setting: certain clients could always be considered as outliers. To address this issue, we introduce a lottery model in which each client $j$ is allowed to submit a parameter $p_j \in [0,1]$ and we look for a random solution that covers every client $j$ with probability at least $p_j$. Our techniques include a randomized rounding procedure to round a point inside a matroid intersection polytope to a basis plus at most one extra item such that all marginal probabilities are preserved and such that a certain linear function of the variables does not decrease in the process with probability one.

Optimal streaming and tracking distinct elements with high probability

The distinct elements problem is one of the fundamental problems in streaming algorithms --- given a stream of integers in the range $\{1,\ldots,n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input. After a long line of research optimal solution for this problem with constant probability of success, using $\mathcal{O}(\frac{1}{\varepsilon^2}+\log n)$ bits of space, was given by Kane, Nelson and Woodruff in 2010. The standard approach used in order to achieve low failure probability $\delta$, is to take a median of $\log \delta^{-1}$ parallel repetitions of the original algorithm and report the median of computed answers. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using $\mathcal{O}(\frac{\log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space --- matching known lower bounds for this problem. That is, the $\log\delta^{-1}$ factor does not multiply the $\log n$ term. This settles completely the space complexity of the distinct elements problem with respect to all standard parameters. We consider also \emph{strong tracking} (or \emph{continuous monitoring}) variant of the distinct elements problem, where we want an algorithm which provides an approximation of the number of distinct elements seen so far, at all times of the stream. We show that this variant can be solved using $\mathcal{O}(\frac{\log \log n + \log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space, which we show to be optimal.

Bloom Filters in Adversarial Environments

Many efficient data structures use randomness, allowing them to improve upon deterministic ones. Usually, their efficiency and/or correctness are analyzed using probabilistic tools under the assumption that the inputs and queries are independent of the internal randomness of the data structure. In this work, we consider data structures in a more robust model, which we call the adversarial model. Roughly speaking, this model allows an adversary to choose inputs and queries adaptively according to previous responses. Specifically, we consider a data structure known as "Bloom filter" and prove a tight connection between Bloom filters in this model and cryptography. A Bloom filter represents a set S of elements approximately, by using fewer bits than a precise representation. The price for succinctness is allowing some errors: for any x in S it should always answer `Yes', and for any x not in S it should answer 'Yes' only with small probability. In the adversarial model, we consider both efficient adversaries (that run in polynomial time) and computationally unbounded adversaries that are only bounded in the amount of queries they can make. For computationally bounded adversaries, we show that non-trivial (memory-wise) Bloom filters exist if and only if one-way functions exist. For unbounded adversaries we show that there exists a Bloom filter for sets of size n and error eps, that is secure against t queries and uses only O(n*log(1/eps)+t) bits of memory. In comparison, n*log(1/eps) is the best possible under a non-adaptive adversary.

Online Submodular Maximization with Preemption

Submodular function maximization has been studied extensively in recent years under various constraints and models. The problem plays a major role in various disciplines. We study a natural online variant of this problem in which elements arrive one-by-one and the algorithm has to maintain a solution obeying certain constraints at all times. Upon arrival of an element, the algorithm has to decide whether to accept the element into its solution and may preempt previously chosen elements. The goal is to maximize a submodular function over the set of elements in the solution. We study two special cases of this general problem and derive upper and lower bounds on the competitive ratio. Specifically, we design a 1/e-competitive algorithm for the unconstrained case in which the algorithm may hold any subset of the elements, and constant competitive ratio algorithms for the case where the algorithm may hold at most k elements in its solution.

Optimal Hashing--Based Time--Space Trade-offs for Approximate Near Neighbors

We show tight upper and lower bounds for time--space trade-offs for the c-Approximate Near Neighbor Search problem. For the d-dimensional Euclidean space and n-point datasets, we develop a data structure with space n1+Áu+o(1)+O(dn) and query time nÁq+o(1)+dno(1) for every Áu, Áqe0 with: c2q + (c2 - 1) Áu = (2c2-1). For example, for the approximation c=2 we achieve: Ï Space n1.77... and query time no(1), significantly improving upon known data structures that support very fast queries; Ï Space n1.14... and query time n0.14..., matching the optimal data-dependent Locality--Sensitive Hashing (LSH); Ï Space n1+o(1) and query time n0.43..., making significant progress in the regime of near-linear space, which is arguably of the most interest for practice. This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor, improving upon [Kap15]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on spherical Locality-Sensitive Filtering [BDGL16] and data-dependent hashing [AR15]. Our matching lower bounds are of two types: conditional and unconditional. We prove tightness of the whole trade-off in a restricted model of computation. We show unconditional cell-probe lower bounds for one and two probes that match the trade-off for $\rho_q = 0$. This is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than the corresponding one-probe bound. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.

P-FPT algorithms for bounded clique-width graphs

Recently, hardness results for problems in P were achieved using reasonable complexity theoretic assumptions such as the Strong Exponential Time Hypothesis. According to these assumptions, many graph theoretic problems do not admit truly subquadratic algorithms. A central technique used to tackle the difficulty of the above mentioned problems is fixed-parameter algorithms with {\em polynomial dependency} in the fixed parameter (P-FPT). Applying this technique to {\em clique-width}, an important graph parameter, remained to be done. In this paper we study several graph theoretic problems for which hardness results exist such as {\em cycle problems}, {\em distance problems} and {\em maximum matching}. We give hardness results and P-FPT algorithms, using clique-width and some of its upper-bounds as parameters. We believe that our most important result is an $\mathcal{O}(k^4 \cdot n + m)$-time algorithm for computing a maximum matching where $k$ is either the modular-width or the $P_4$-sparseness. The latter generalizes many algorithms that have been introduced so far for specific subclasses such as cographs. Our algorithms are based on preprocessing methods using modular decomposition and split decomposition. Thus they can also be generalized to some graph classes with unbounded clique-width.

A new algorithm for fast generalized DFTs

We give an new arithmetic algorithm to compute the generalized Discrete Fourier Transform (DFT) over finite groups $G$. The new algorithm uses $O(|G|^{\omega/2 + o(1)})$ operations to compute the generalized DFT over finite groups of Lie type, including the linear, orthogonal, and symplectic families and their variants, as well as all finite simple groups of Lie type. Here $\omega$ is the exponent of matrix multiplication, so the exponent $\omega/2$ is optimal if $\omega = 2$. Previously, ``exponent one'' algorithms were known for supersolvable groups and the symmetric and alternating groups. No exponent one algorithms were known, even under the assumption $\omega = 2$, for families of linear groups of fixed dimension, and indeed the previous best-known algorithm for $\SL_2(\F_q)$ had exponent $4/3$ despite being the focus of significant effort. We unconditionally achieve exponent at most $1.19$ for this group, and exponent one if $\omega = 2$. Our algorithm also yields an improved exponent for computing the generalized DFT over general finite groups $G$, which beats the longstanding previous best upper bound, for any $\omega$. In particular, assuming $\omega = 2$, we achieve exponent $\sqrt{2}$, while the previous best was $3/2$.

Sparse Approximation via Generating Point Sets

For a set P of n points in the unit ball b†Rd, consider the problem of finding a small subset T†P such that its convex-hull µ-approximates the convex-hull of the original set. We present an efficient algorithm to compute such a µ2-approximation of size kalg, where µ2 is function of µ, and kalg is a function of the minimum size kopt of such an µ-approximation. Surprisingly, there is no dependency on the dimension d in both bounds. Furthermore, every point of P can be µ-approximated by a convex-combination of points of T that is O(1/µ2)-sparse. Our result can be viewed as a method for sparse, convex autoencoding: approximately representing the data in a compact way using sparse combinations of a small subset T of the original data. The new algorithm can be kernelized, and it preserves sparsity in the original input.

Domination when the stars are out

We algorithmize the structural characterization for claw-free graphs by Chudnovsky and Seymour. Building on this result, we show that Dominating Set on claw-free graphs is (i) fixed-parameter tractable and (ii) even possesses a polynomial kernel. To complement these results, we establish that Dominating Set is unlikely to be fixed-parameter tractable on the slightly larger class of graphs that exclude K_1,4 as an induced subgraph (K_1,4-free graphs). We show that our algorithmization can also be used to show that the related Connected Dominating Set problem is fixed-parameter tractable on claw-free graphs. To complement that result, we show that Connected Dominating Set is unlikely to have a polynomial kernel on claw-free graphs and is unlikely to be fixed-parameter tractable on K_1,4-free graphs. Combined, our results provide a dichotomy for Dominating Set and Connected Dominating Set on K_1,L-free graphs and show that the problem is fixed-parameter tractable if and only if L <= 3.

All ACM Journals | See Full Journal Index

Search TALG
enter search term and/or author name