Given a drawing of a read-once formula (called the blueprint), and a blackbox implementation with the same topology as the blueprint, that purports to compute the formula, can we tell if it does? Under a fault model, where the only faults in the implementation are gates that complement their outputs, we show that there is an efficient algorithm that makes a linear number of probes to the blackbox implementation and determines if the blueprint and implementation are identical. We also show a matching lower bound. We further ask whether we can diagnose where the faults are, using black-box testing. We prove that if the implementation has a property called {\em polynomial balance} it is possible to do this efficiently. To complement this result, we show that even if the \textit{blueprint} is polynomially balanced, and there are only logarithmically many errors in the implementation, the implementation could be unbalanced and the diagnosis problem provably requires super-polynomially many tests. We point out that this problem is one instance of a general class of problems of learning deviations from a blueprint, which we call {\em conformance learning}. Conformance learning seems worthy of further investigation in a broader context.

We study clustering problems with outliers. More specifically, we look at \ufl (UFL), \kmed, and \kmeans. In these problems, we are given a set $\cX$ of data points in a metric space $\delta(.,.)$, a set $\fa$ of possible centres (each maybe with an opening cost), maybe an integer parameter $k$, plus an additional parameter $z$ as the number of outliers. In \ufl with outliers, we have to open some centres, discard up to $z$ points of $\cX$ and assign every other point to the nearest open centre, minimizing the total assignment cost plus centre opening costs. In \kmed and \kmeans, we have to open up to $k$ centres but there are no opening costs. In \kmeans, the cost of assigning $j$ to $i$ is $\delta^2(j,i)$. We present several results. Our main focus is on cases where $\delta$ is a doubling metric or is the shortest path metrics of graphs from a minor-closed family of graphs. For \uufl with outliers on such metrics we show that a multiswap simple local search heuristic yields a PTAS. With a bit more work, we extend this to bicriteria approximations for the \kmed and \kmeans problems in the same metrics where, for any constant $\epsilon > 0$, we can find a solution using $(1+\epsilon)k$ centres whose cost is at most a $(1+\epsilon)$-factor of the optimum and uses at most $z$ outliers.

We consider two related scheduling problems: single resource constrained scheduling on identical parallel machines and a generalization with resource dependent processing times. In both problems, jobs require a certain amount of an additional resource and have to be scheduled on machines minimizing the makespan, while at every point in time a given resource capacity is not exceeded. In the first variant of the problem the processing times and resource amounts are fixed, while in the second the former depends on the latter. Both problems contain bin packing with cardinality constraint as a special case, and, therefore, these problems are strongly NP-complete even for a constant number of machines larger than three, which can be proven by a reduction from 3-Partition. Furthermore, if the number of machines is part of the input, we can not hope for an approximation algorithm with absolute approximation ratio smaller than 3/2. We present asymptotic fully polynomial time approximation schemes (AFPTAS) for the problems: For any µ > 0 a schedule of length at most (1+µ) times the optimum plus an additive term of O(log(1/µ)/µ) is provided, and the running time is polynomially bounded in 1/µ and the input length. Up to now only approximation algorithms with absolute approximation ratios were known. Furthermore, the AFPTAS for resource constrained scheduling on identical parallel machines directly improves the additive term of the best AFPTAS for bin packing with cardinality constraint so far.

In this paper, we give tight approximation algorithms for the $k$-center and matroid center problems with outliers. Unfairness arises naturally in this setting: certain clients could always be considered as outliers. To address this issue, we introduce a lottery model in which each client $j$ is allowed to submit a parameter $p_j \in [0,1]$ and we look for a random solution that covers every client $j$ with probability at least $p_j$. Our techniques include a randomized rounding procedure to round a point inside a matroid intersection polytope to a basis plus at most one extra item such that all marginal probabilities are preserved and such that a certain linear function of the variables does not decrease in the process with probability one.

The distinct elements problem is one of the fundamental problems in streaming algorithms --- given a stream of integers in the range $\{1,\ldots,n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input. After a long line of research optimal solution for this problem with constant probability of success, using $\mathcal{O}(\frac{1}{\varepsilon^2}+\log n)$ bits of space, was given by Kane, Nelson and Woodruff in 2010. The standard approach used in order to achieve low failure probability $\delta$, is to take a median of $\log \delta^{-1}$ parallel repetitions of the original algorithm and report the median of computed answers. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using $\mathcal{O}(\frac{\log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space --- matching known lower bounds for this problem. That is, the $\log\delta^{-1}$ factor does not multiply the $\log n$ term. This settles completely the space complexity of the distinct elements problem with respect to all standard parameters. We consider also \emph{strong tracking} (or \emph{continuous monitoring}) variant of the distinct elements problem, where we want an algorithm which provides an approximation of the number of distinct elements seen so far, at all times of the stream. We show that this variant can be solved using $\mathcal{O}(\frac{\log \log n + \log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space, which we show to be optimal.

Many efficient data structures use randomness, allowing them to improve upon deterministic ones. Usually, their efficiency and/or correctness are analyzed using probabilistic tools under the assumption that the inputs and queries are independent of the internal randomness of the data structure. In this work, we consider data structures in a more robust model, which we call the adversarial model. Roughly speaking, this model allows an adversary to choose inputs and queries adaptively according to previous responses. Specifically, we consider a data structure known as "Bloom filter" and prove a tight connection between Bloom filters in this model and cryptography. A Bloom filter represents a set S of elements approximately, by using fewer bits than a precise representation. The price for succinctness is allowing some errors: for any x in S it should always answer `Yes', and for any x not in S it should answer 'Yes' only with small probability. In the adversarial model, we consider both efficient adversaries (that run in polynomial time) and computationally unbounded adversaries that are only bounded in the amount of queries they can make. For computationally bounded adversaries, we show that non-trivial (memory-wise) Bloom filters exist if and only if one-way functions exist. For unbounded adversaries we show that there exists a Bloom filter for sets of size n and error eps, that is secure against t queries and uses only O(n*log(1/eps)+t) bits of memory. In comparison, n*log(1/eps) is the best possible under a non-adaptive adversary.

Submodular function maximization has been studied extensively in recent years under various constraints and models. The problem plays a major role in various disciplines. We study a natural online variant of this problem in which elements arrive one-by-one and the algorithm has to maintain a solution obeying certain constraints at all times. Upon arrival of an element, the algorithm has to decide whether to accept the element into its solution and may preempt previously chosen elements. The goal is to maximize a submodular function over the set of elements in the solution. We study two special cases of this general problem and derive upper and lower bounds on the competitive ratio. Specifically, we design a 1/e-competitive algorithm for the unconstrained case in which the algorithm may hold any subset of the elements, and constant competitive ratio algorithms for the case where the algorithm may hold at most k elements in its solution.

We show tight upper and lower bounds for time--space trade-offs for the *c*-Approximate Near Neighbor Search problem.
For the *d*-dimensional Euclidean space and *n*-point datasets, we develop a data structure with space *n ^{1+Áu+o(1)}+O(dn)* and query time

*n*for every

^{Áq+o(1)}+dn^{o(1)}*Á*with:

_{u}, Á_{q}e0*c*For example, for the approximation

^{2}Á_{q}+ (c^{2}- 1) Á_{u}= (2c^{2}-1).*c=2*we achieve: Ï Space

*n*and query time

^{1.77...}*n*, significantly improving upon known data structures that support very fast queries; Ï Space

^{o(1)}*n*and query time

^{1.14...}*n*, matching the optimal data-dependent Locality--Sensitive Hashing (LSH); Ï Space

^{0.14...}*n*and query time

^{1+o(1)}*n*, making significant progress in the regime of near-linear space, which is arguably of the most interest for practice. This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor, improving upon [Kap15]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on spherical Locality-Sensitive Filtering [BDGL16] and data-dependent hashing [AR15]. Our matching lower bounds are of two types: conditional and unconditional. We prove tightness of the whole trade-off in a restricted model of computation. We show unconditional cell-probe lower bounds for one and two probes that match the trade-off for $\rho_q = 0$. This is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than the corresponding one-probe bound. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.

^{0.43...}Recently, hardness results for problems in P were achieved using reasonable complexity theoretic assumptions such as the Strong Exponential Time Hypothesis. According to these assumptions, many graph theoretic problems do not admit truly subquadratic algorithms. A central technique used to tackle the difficulty of the above mentioned problems is fixed-parameter algorithms with {\em polynomial dependency} in the fixed parameter (P-FPT). Applying this technique to {\em clique-width}, an important graph parameter, remained to be done. In this paper we study several graph theoretic problems for which hardness results exist such as {\em cycle problems}, {\em distance problems} and {\em maximum matching}. We give hardness results and P-FPT algorithms, using clique-width and some of its upper-bounds as parameters. We believe that our most important result is an $\mathcal{O}(k^4 \cdot n + m)$-time algorithm for computing a maximum matching where $k$ is either the modular-width or the $P_4$-sparseness. The latter generalizes many algorithms that have been introduced so far for specific subclasses such as cographs. Our algorithms are based on preprocessing methods using modular decomposition and split decomposition. Thus they can also be generalized to some graph classes with unbounded clique-width.

We give an new arithmetic algorithm to compute the generalized Discrete Fourier Transform (DFT) over finite groups $G$. The new algorithm uses $O(|G|^{\omega/2 + o(1)})$ operations to compute the generalized DFT over finite groups of Lie type, including the linear, orthogonal, and symplectic families and their variants, as well as all finite simple groups of Lie type. Here $\omega$ is the exponent of matrix multiplication, so the exponent $\omega/2$ is optimal if $\omega = 2$. Previously, ``exponent one'' algorithms were known for supersolvable groups and the symmetric and alternating groups. No exponent one algorithms were known, even under the assumption $\omega = 2$, for families of linear groups of fixed dimension, and indeed the previous best-known algorithm for $\SL_2(\F_q)$ had exponent $4/3$ despite being the focus of significant effort. We unconditionally achieve exponent at most $1.19$ for this group, and exponent one if $\omega = 2$. Our algorithm also yields an improved exponent for computing the generalized DFT over general finite groups $G$, which beats the longstanding previous best upper bound, for any $\omega$. In particular, assuming $\omega = 2$, we achieve exponent $\sqrt{2}$, while the previous best was $3/2$.

For a set P of n points in the unit ball bRd, consider the problem of finding a small subset TP such that its convex-hull µ-approximates the convex-hull of the original set. We present an efficient algorithm to compute such a µ2-approximation of size kalg, where µ2 is function of µ, and kalg is a function of the minimum size kopt of such an µ-approximation. Surprisingly, there is no dependency on the dimension d in both bounds. Furthermore, every point of P can be µ-approximated by a convex-combination of points of T that is O(1/µ2)-sparse. Our result can be viewed as a method for sparse, convex autoencoding: approximately representing the data in a compact way using sparse combinations of a small subset T of the original data. The new algorithm can be kernelized, and it preserves sparsity in the original input.

We algorithmize the structural characterization for claw-free graphs by Chudnovsky and Seymour. Building on this result, we show that Dominating Set on claw-free graphs is (i) fixed-parameter tractable and (ii) even possesses a polynomial kernel. To complement these results, we establish that Dominating Set is unlikely to be fixed-parameter tractable on the slightly larger class of graphs that exclude K_1,4 as an induced subgraph (K_1,4-free graphs). We show that our algorithmization can also be used to show that the related Connected Dominating Set problem is fixed-parameter tractable on claw-free graphs. To complement that result, we show that Connected Dominating Set is unlikely to have a polynomial kernel on claw-free graphs and is unlikely to be fixed-parameter tractable on K_1,4-free graphs. Combined, our results provide a dichotomy for Dominating Set and Connected Dominating Set on K_1,L-free graphs and show that the problem is fixed-parameter tractable if and only if L <= 3.