Dependent rounding is a useful technique for optimization problems with hard budget constraints. This framework naturally leads to \emph{negative correlation} properties. However, what if an application naturally calls for dependent rounding on the one hand, but desires \emph{positive} correlation on the other? More generally, we develop algorithms that guarantee the known properties of dependent rounding but also have nearly best-possible behavior -- near-independence, which generalizes positive correlation -- on ``small'' subsets of the variables. The recent breakthrough of Li \& Svensson for the classical $k$-median problem has to handle positive correlation in certain dependent-rounding settings, and does so implicitly. We improve upon Li-Svensson's approximation ratio for $k$-median from $2.732 + \epsilon$ to $2.611 + \epsilon$ via an algorithm that refines several aspects of their work. Our dependent-rounding approach also improves the dependence of the runtime on the parameter $\epsilon$, from Li-Svensson's $N^{O(1/\epsilon^2)}$ to $N^{O((1/\epsilon) \log(1/\epsilon))}$.
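A minimal sketch of one standard form of dependent rounding (a pipage-style scheme on a vector of variables; the function name and representation are illustrative, not from the paper): repeatedly pick two fractional coordinates and shift probability mass between them so that one becomes integral, preserving each marginal and the total budget. Negative correlation arises because raising one coordinate forces lowering the other.

```python
import random

def dependent_round(x):
    """Round fractional x in [0,1]^n to {0,1}^n, preserving sum(x)
    exactly (when it is an integer) and each marginal E[X_i] = x_i."""
    x = list(x)
    while True:
        frac = [i for i, v in enumerate(x) if 0 < v < 1]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        a = min(1 - x[i], x[j])   # how far we can raise x_i / lower x_j
        b = min(x[i], 1 - x[j])   # how far we can lower x_i / raise x_j
        # Choose the direction with probabilities that keep E[x_i], E[x_j]
        # unchanged: +a with prob b/(a+b), -b with prob a/(a+b).
        if random.random() < b / (a + b):
            x[i], x[j] = x[i] + a, x[j] - a
        else:
            x[i], x[j] = x[i] - b, x[j] + b
    return [round(v) for v in x]
```

Each iteration makes at least one coordinate integral, so the loop runs at most $n$ times; the hard budget $\sum_i x_i$ is untouched by every step.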

In this paper, we study the uniform capacitated $k$-median problem. We are given a set $F$ of potential facility locations, a set $C$ of clients, a metric $d$ over $F \cup C$, an upper bound $k$ on the number of facilities we can open, and an upper bound $u$ on the number of clients each facility can serve. We must open a subset $S \subseteq F$ of $k$ facilities and connect the clients in $C$ to facilities in $S$ so that each facility is connected to at most $u$ clients. The goal is to minimize the total connection cost over all clients. Obtaining a constant approximation algorithm for this problem is a notorious open problem; most previous works gave constant approximations by violating either the capacity constraints or the cardinality constraint. Notably, all these algorithms are based on the natural LP relaxation for the problem. This LP relaxation has unbounded integrality gap, even when we are allowed to violate the capacity constraints or the cardinality constraint by a factor of $2-\epsilon$. Our result is an $\exp(O(1/\epsilon^2))$-approximation algorithm for the problem that violates the cardinality constraint by a factor of $1+\epsilon$. This is already beyond the capability of the natural LP relaxation, which has unbounded integrality gap even if we are allowed to open $(2-\epsilon)k$ facilities. Indeed, our result is based on a novel LP for this problem. We hope that this LP is a first step towards a constant approximation for capacitated $k$-median.
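As a concrete reference point for the problem just defined, here is a brute-force evaluator for tiny instances (illustrative only; the paper's algorithm is LP-based, and this enumeration is exponential in the instance size):

```python
from itertools import combinations, product

def capacitated_k_median(F, C, d, k, u):
    """Exhaustively find k open facilities and a capacity-respecting
    assignment of clients that minimizes total connection cost.
    d[(i, j)] is the distance between facility i and client j."""
    best = (float("inf"), None, None)
    for S in combinations(F, k):
        # Try every assignment of clients to open facilities.
        for assign in product(S, repeat=len(C)):
            if any(assign.count(i) > u for i in S):
                continue  # capacity constraint violated
            cost = sum(d[(assign[t], C[t])] for t in range(len(C)))
            best = min(best, (cost, S, assign))
    return best
```

For example, with two facilities, three clients, and $u=2$, the evaluator finds the cheapest assignment in which no facility serves more than two clients.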

We consider the problem of, given an array $A[1,n]$ of elements with a total order, building a data structure that answers two types of queries: $(a)$ selection queries, which receive a range $[i,j]$ and an integer $k$ and return the position of the $k$th largest element in $A[i,j]$; and $(b)$ top-$k$ queries, which receive $[i,j]$ and $k$ and return the positions of the $k$ largest elements in $A[i,j]$. These queries can be answered in optimal time, $O(1+\log k/\log\log n)$ and $O(k)$ respectively, using linear-space data structures. We present the first study of {\em encoding} data structures for this problem, which do not access $A$ at query time; these save storage space in applications where the values of $A$ themselves are not of interest. We first show that any encoding answering such queries requires $n\log k - O(n+k \log k)$ bits of space. Then we design encodings using $O(n\log k)$ bits, that is, asymptotically optimal up to constant factors, that answer selection and top-$k$ queries in optimal time.
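For concreteness, a naive baseline for the two query types (using 0-based Python indices rather than the paper's $A[1,n]$ convention; this sorts the whole window per query, so it accesses $A$ and does not achieve the optimal times quoted above):

```python
def select(A, i, j, k):
    """Position in A of the k-th largest element of A[i..j] (k is 1-based)."""
    order = sorted(range(i, j + 1), key=lambda p: A[p], reverse=True)
    return order[k - 1]

def top_k(A, i, j, k):
    """Positions of the k largest elements of A[i..j], largest first."""
    return sorted(range(i, j + 1), key=lambda p: A[p], reverse=True)[:k]
```

An encoding data structure would answer the same queries from an $O(n\log k)$-bit sketch without consulting `A` at all.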

This paper addresses the problem of designing a {\em fault-tolerant} $(\alpha, \beta)$ approximate BFS structure (or {\em FT-ABFS structure} for short), namely, a subgraph $H$ of the network $G$ such that, subsequent to the failure of some subset $F$ of edges or vertices, the surviving part $H'$ of $H$ still contains an \emph{approximate} BFS spanning tree for (the surviving part of) $G$, satisfying $\dist(s,v,H\setminus F) \leq \alpha \cdot \dist(s,v,G\setminus F)+\beta$ for every $v \in V$. In SODA'14, we considered {\em multiplicative} $(\alpha,0)$ FT-ABFS structures resilient to the failure of a \emph{single} edge and presented an algorithm that, given an $n$-vertex unweighted undirected graph $G$ and a source $s$, constructs a $(3,0)$ FT-ABFS structure rooted at $s$ with at most $4n$ edges (improving by an $O(\log n)$ factor on the near-tight result of \cite{BS10} for the special case of edge failures). Assuming at most $f$ edge failures, for constant integer $f>1$, we prove that there exists a (poly-time constructible) $(3(f+1), (f+1) \log n)$ FT-ABFS structure with $O(fn)$ edges, resilient to the failure of up to $f$ edges. We then consider {\em additive} $(1,\beta)$ FT-ABFS structures and demonstrate an interesting dichotomy between multiplicative and additive spanners. In contrast to the linear size of $(\alpha,0)$ FT-ABFS structures, we show that for every $\beta \in [1, O(\log n)]$ there exists an $n$-vertex graph $G$ with a source $s$ for which any $(1,\beta)$ FT-ABFS structure rooted at $s$ has $\Omega(n^{1+\epsilon(\beta)})$ edges, for some function $\epsilon(\beta) \in (0,1)$.
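The $(\alpha,\beta)$ guarantee above is easy to check on small examples. A sketch of a verifier (adjacency-list representation and function names are assumptions for illustration, not from the paper) that tests whether a subgraph $H$ satisfies the condition for a given edge-failure set $F$:

```python
from collections import deque

def dist(adj, s, banned):
    """BFS distances from s after removing the banned edge set."""
    D = {s: 0}
    q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if (v, w) in banned or (w, v) in banned or w in D:
                continue
            D[w] = D[v] + 1
            q.append(w)
    return D

def is_ft_abfs(adj_G, adj_H, s, F, alpha, beta):
    """Check dist(s,v,H\\F) <= alpha*dist(s,v,G\\F) + beta for all v
    reachable from s in G\\F."""
    dG, dH = dist(adj_G, s, F), dist(adj_H, s, F)
    return all(v in dH and dH[v] <= alpha * dG[v] + beta for v in dG)
```

For instance, a plain BFS tree of a cycle fails the check once one of its edges is cut, while the full cycle trivially passes.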

Even though local search heuristics are the method of choice in practice for many well-studied optimization problems, most of them behave poorly in the worst case. This is in particular the case for the Maximum-Cut Problem, for which local search can take an exponential number of steps to terminate and for which computing a local optimum is PLS-complete. To narrow the gap between theory and practice, we study local search for the Maximum-Cut Problem in the framework of smoothed analysis, in which inputs are subject to a small amount of random noise. We show that the smoothed number of iterations is quasi-polynomial, i.e., it is bounded from above by a polynomial in $n^{\log n}$ and $\phi$, where $n$ denotes the number of nodes and $\phi$ denotes the perturbation parameter. This shows that worst-case instances are fragile, and it is a first step in explaining why they are rarely observed in practice.
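For concreteness, a sketch of the FLIP local search studied here, in its simplest deterministic-scan form (representation and names are illustrative): flip any single node across the cut while doing so increases the cut weight, and stop at a local optimum.

```python
def local_search_max_cut(n, w):
    """FLIP local search for Max-Cut. w maps edges (u, v) with u < v to
    weights; returns (side assignment, local-optimum cut value)."""
    side = [0] * n
    def cut_value():
        return sum(wt for (u, v), wt in w.items() if side[u] != side[v])
    improved = True
    while improved:
        improved = False
        for v in range(n):
            # Gain of flipping v: edges newly cut minus edges newly uncut.
            gain = sum(wt if side[a] == side[b] else -wt
                       for (a, b), wt in w.items() if v in (a, b))
            if gain > 0:
                side[v] = 1 - side[v]
                improved = True
    return side, cut_value()
```

Each accepted flip strictly increases the cut value, which is what makes worst-case instances with exponentially long improvement sequences possible, and what smoothed analysis rules out up to quasi-polynomial bounds.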

The Lopsided Lovasz Local Lemma (LLLL) is a powerful principle in combinatorics and probability. While it began as a general statement about probability spaces, it has recently been transformed into a variety of polynomial-time algorithms. The resampling algorithm of Moser & Tardos is the best-known example of this. A variety of criteria have been shown for the LLLL; the strongest possible criterion was shown by Shearer. We show a new criterion for the Moser-Tardos algorithm to converge. This criterion is stronger than the LLLL criterion, and in fact can yield better results even than the full Shearer criterion. This is possible because it does not apply in the same generality as the original LLLL; yet, it is strong enough to cover many applications of the LLLL in combinatorics. A noteworthy application is to $k$-SAT with bounded occurrences of variables. As shown by Gebauer, Szabó, and Tardos, a $k$-SAT instance in which every variable appears $L \leq \frac{2^{k+1}}{e (k+1)}$ times is satisfiable. Although this bound is asymptotically tight (in $k$), we improve it to $L \leq \frac{2^{k+1} (1 - 1/k)^k}{k-1} - \frac{2}{k}$, which can be significantly stronger when $k$ is small. We introduce a new parallel algorithm for the LLLL. While Moser & Tardos described a simple parallel algorithm for the Lovasz Local Lemma, and described a simple sequential algorithm for a form of the Lopsided Lemma, they were not able to combine the two.
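A minimal sketch of the Moser-Tardos resampling algorithm specialized to SAT (the clause encoding and step cap are illustrative assumptions; under the convergence criteria discussed above, the expected number of resamplings is polynomial):

```python
import random

def moser_tardos_sat(n, clauses, max_steps=100000):
    """Resampling algorithm: start from a uniformly random assignment;
    while some clause is violated, resample all of its variables.
    Clauses are lists of nonzero signed 1-based literals."""
    assign = [random.random() < 0.5 for _ in range(n + 1)]  # index 0 unused
    sat = lambda c: any(assign[abs(l)] == (l > 0) for l in c)
    for _ in range(max_steps):
        bad = next((c for c in clauses if not sat(c)), None)
        if bad is None:
            return assign[1:]
        for l in bad:  # resample every variable of the violated clause
            assign[abs(l)] = random.random() < 0.5
    return None  # step cap reached (should not happen in the LLL regime)
```

The tiny test instance below is not in the LLL regime, but the algorithm still converges on it almost surely.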

We propose a new approach to competitive analysis in online scheduling by introducing the novel concept of competitive-ratio approximation schemes. Such a scheme algorithmically constructs an online algorithm with a competitive ratio arbitrarily close to the best possible competitive ratio for any online algorithm. We study the problem of scheduling jobs online to minimize the weighted sum of completion times on parallel, related, and unrelated machines, and we derive both deterministic and randomized algorithms that are almost best possible among all online algorithms for the respective settings. We also generalize our techniques to arbitrary monomial cost functions and apply them to the makespan objective. Our method relies on an abstract characterization of online algorithms combined with various simplifications and transformations. We also contribute algorithmic means to compute the actual value of the best possible competitive ratio up to arbitrary accuracy. This contrasts sharply with nearly all previous, manually obtained competitiveness results and, most importantly, reduces the search for the optimal competitive ratio to a question that a computer can answer. We believe that our concept can also be applied to many other problems and yields a new perspective on online algorithms in general.

This paper addresses a general limitation of the standard tractability results for Constraint Satisfaction Problems (CSP) and counting CSP (#CSP): they apply only to instances in which all constraints belong to a single tractable language. We show that this limitation can be overcome as long as we keep some control over how constraints from the various tractable languages under consideration interact with each other. For this purpose we utilize the notion of a strong backdoor of a CSP instance, introduced by Williams et al. (IJCAI 2003): a set of variables whose instantiation moves the instance to an island of tractability, i.e., to a tractable class of instances. Our main result is an algorithm that, given a CSP instance with $n$ variables, finds in time $f(k)n^{O(1)}$ a strong backdoor of size $k$ into a scattered class (associated with a list of finite conservative constraint languages), or correctly decides that there is no such backdoor. This also gives the running time for solving #CSP, provided that #CSP is polynomial-time tractable for the constraint languages under consideration. Our result makes significant progress towards the main goal of the backdoor-based approach to CSPs -- the identification of maximal base classes for which small backdoors can be detected efficiently.

Several simple, classical, little-known algorithms from the statistical literature for generating random permutations by coin tossing are examined, analyzed, and implemented. These algorithms are either asymptotically optimal or close to being so in terms of the expected number of random bits generated. In addition to asymptotic approximations to the expected complexity, we also characterize the corresponding variances, as well as the asymptotic distributions. A brief comparative discussion with numerical computations on a multicore system is also given.
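A sketch of the flavor of algorithm discussed: a Fisher-Yates shuffle whose only randomness source is fair coin flips, using rejection sampling for uniform integers (this simple rejection scheme is illustrative and is not one of the optimized algorithms analyzed in the paper):

```python
import random

def coin():
    """One fair random bit."""
    return random.getrandbits(1)

def uniform_int(n):
    """Uniform integer in [0, n) from fair coins: draw ceil(log2 n) bits,
    retry if the result is >= n. Uses 2 draws on average in the worst case."""
    bits = max(1, (n - 1).bit_length())
    while True:
        r = 0
        for _ in range(bits):
            r = (r << 1) | coin()
        if r < n:
            return r

def random_permutation(n):
    """Fisher-Yates shuffle driven only by coin flips."""
    p = list(range(n))
    for i in range(n - 1, 0, -1):
        j = uniform_int(i + 1)
        p[i], p[j] = p[j], p[i]
    return p
```

The analysis questions in the paper concern exactly the quantity this sketch makes visible: how many calls to `coin()` such procedures consume in expectation and in distribution.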

The dynamic facility location problem is a generalization of the classic facility location problem proposed by Eisenstat, Mathieu, and Schabanel to model the dynamics of evolving social/infrastructure networks. The generalization is that the distance metric between clients and facilities changes over time. This leads to a trade-off between optimizing the classic objective function and the "stability" of the solution: a switching cost is charged every time a client changes the facility to which it is connected. While the standard linear program (LP) relaxation for the classic problem naturally extends to this problem, traditional LP-rounding techniques do not, as they are often sensitive to small changes in the metric, resulting in frequent switches. We present a new LP-rounding algorithm for facility location problems, which yields the first constant approximation algorithm for the dynamic facility location problem. Our algorithm installs competing exponential clocks on the clients and facilities, and connects every client via the path that repeatedly follows the smallest clock in the neighborhood. The use of exponential clocks gives rise to several properties that distinguish our approach from previous LP roundings for facility location problems. In particular, we use no clustering, and we allow clients to connect through paths of arbitrary lengths. In fact, the clustering-free nature of our algorithm is crucial for applying our LP-rounding approach to the dynamic problem.
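A sketch of the clock mechanism only (all names are illustrative assumptions; the paper's algorithm follows paths of arbitrary length through clients and facilities, whereas this one-step version merely shows how competing exponential clocks select a facility from the LP solution):

```python
import random

def exponential_clock_rounding(facilities, clients, neighbors, y):
    """Facility f with LP-opening value y[f] > 0 draws an Exp(y[f]) clock;
    every client connects to the neighboring facility whose clock rings
    first. Exponential clocks make the choice robust to small metric
    changes, since a tiny perturbation rarely reorders the minima."""
    clock = {f: random.expovariate(y[f]) for f in facilities if y[f] > 0}
    assign = {}
    for c in clients:
        candidates = [f for f in neighbors[c] if f in clock]
        assign[c] = min(candidates, key=clock.get)
    return assign
```

The memorylessness of the exponential distribution is what drives the properties mentioned above; in particular, the winner among competing clocks is distributed proportionally to the rates $y[f]$.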