Nearest-neighbor (NN) search, which returns the nearest neighbor of a query point in a set of points, is an important and widely studied problem in many fields, and it has wide range of applications. In many of them, such as sensor databases, location-based services, face recognition, and mobile data, the location of data is imprecise. We therefore study nearest neighbor queries in a probabilistic framework in which the location of each input point is specified as a probability density function. We present efficient algorithms for (i) computing all points that are nearest neighbors of a query point with nonzero probability; (ii) estimating, within a specified additive error, the probability of a point being the nearest neighbor of a query point; (iii) using it to return the point that maximizes the probability being the nearest neighbor, or all the points with probabilities greater than some threshold to be the NN.

We introduce the $st$-cut version the Sparsest-Cut problem, where the goal is to find a cut of minimum sparsity among those separating two distinguished vertices $s,t\in V$. Clearly, this problem is at least as hard as the usual (non-$st$) version. Our main result is a polynomial-time algorithm for the product-demands setting, that produces a cut of sparsity $O(\sqrt{\OPT})$, where $\OPT$ denotes the optimum, and the total edge capacity and the total demand are assumed (by normalization) to be $1$. Our result generalizes the recent work of Trevisan [arXiv, 2013] for the non-$st$ version of the same problem (Sparsest-Cut with product demands), which in turn generalizes the bound achieved by the discrete Cheeger inequality, a cornerstone of Spectral Graph Theory that has numerous applications. Indeed, Cheeger's inequality handles graph conductance, the special case of product demands that are proportional to the vertex (capacitated) degrees. Along the way, we obtain an $O(\log n)$-approximation, where $n=\card{V}$, for the general-demands setting of Sparsest $st$-Cut.

Dependent rounding is a useful technique for optimization problems with hard budget constraints. This framework naturally leads to \emph{negative correlation} properties. However, what if an application naturally calls for dependent rounding on the one hand, and desires \emph{positive} correlation on the other? More generally, we develop algorithms that guarantee the known properties of dependent rounding, but also have nearly best-possible behavior -- near-independence, which generalizes positive correlation -- on ``small" subsets of the variables. The recent breakthrough of Li \& Svensson for the classical $k$-median problem has to handle positive correlation in certain dependent-rounding settings, and does so implicitly. We improve upon Li-Svensson's approximation ratio for $k$-median from $2.732 + \epsilon$ to $2.611 + \epsilon$ by developing an algorithm that improves upon various aspects of their work. Our dependent-rounding approach helps us improve the dependence of the runtime on the parameter $\epsilon$ from Li-Svensson's $N^{O(1/\epsilon^2)}$ to $N^{O((1/\epsilon) \log(1/\epsilon))}$.

In this paper, we study the uniform capacitated k-median problem. In the problem, we are given a set F of potential facility locations, a set C of clients, a metric d over F \cup C, an upper bound k on the number of facilities we can open and an upper bound u on the number of clients each facility can serve. We need to open a subset S \subseteq F of k facilities and connect clients in C to facilities in S so that each facility is connected by at most u clients. The goal is to minimize the total connection cost over all clients. Obtaining a constant approximation algorithm for this problem is a notorious open problem; most previous works gave constant approximations by either violating the capacity constraints or the cardinality constraint. Notably, all these algorithms are based on the natural LP-relaxation for the problem. The LP-relaxation has unbounded integrality gap, even when we are allowed to violate the capacity constraints or the cardinality constraint by a factor of 2-\eps. Our result is an \exp(O(1/\eps^2))-approximation algorithm for the problem that violates the cardinality constraint by a factor of 1+\eps. This is already beyond the capability of the natural LP relaxation, as it has unbounded integrality gap even if we are allowed to open (2-\eps)k facilities. Indeed, our result is based on a novel LP for this problem. We hope that this LP is the first step towards a constant approximation for capacitated k-median.

We consider the classic problem of envy-free division of a heterogeneous good ("cake") among several agents. It is known that, when the allotted pieces must be connected, the problem cannot be solved by a finite algorithm for 3 or more agents. Even when the pieces may be disconnected, no bounded-time algorithm is known for 5 or more agents. The impossibility result, however, assumes that the entire cake must be allocated. In this paper we replace the entire-allocation requirement with a weaker partial-proportionality requirement: the piece given to each agent must be worth for it at least a certain positive fraction of the entire cake value. We prove that this version of the problem is solvable in bounded time even when the pieces must be connected. We present bounded-time envy-free cake-cutting algorithms for: (1) giving each of $n$ agents a connected piece with a positive value; (2) giving each of 3 agents a connected piece worth at least 1/3; (3) giving each of 4 agents a connected piece worth at least 1/7; (4) giving each of 4 agents a disconnected piece worth at least 1/4; (5) giving each of $n$ agents a disconnected piece worth at least $(1-\epsilon)/n$ for any positive $\epsilon$.

This paper studies the set cover problem under the semi-streaming model. The underlying set system is formalized in terms of a hypergraph $G = (V, E)$ whose edges arrive one-by-one and the goal is to construct an edge cover $F \subseteq E$ with the objective of minimizing the cardinality (cost in the weighted case) of $F$. We consider a parameterized relaxation of this problem, where given some $0 \leq \epsilon < 1$, the goal is to construct an edge $(1 - \epsilon)$-cover, namely, a subset of edges incident to all but an $\epsilon$-fraction of the vertices (or their benefit in the weighted case). The key limitation imposed on the algorithm is that its space is limited to (poly)logarithmically many bits per vertex. Our main result is an asymptotically tight trade-off between $\epsilon$ and the approximation ratio: We design a semi-streaming algorithm that on input hypergraph $G$, constructs a succinct data structure $\mathcal{D}$ such that for every $0 \leq \epsilon < 1$, an edge $(1 - \epsilon)$-cover that approximates the optimal edge \mbox{($1$-)cover} within a factor of $f(\epsilon, n)$ can be extracted from $\mathcal{D}$ (efficiently and with no additional space requirements), where \[ f(\epsilon, n) = \left\{ \begin{array}{ll} O (1 / \epsilon), & \text{if } \epsilon > 1 / \sqrt{n} \\ O (\sqrt{n}), & \text{otherwise} \end{array} \right. \, . \] In particular for the traditional set cover problem we obtain an $O(\sqrt{n})$-approximation. This algorithm is proved to be best possible by establishing a family (parameterized by $\epsilon$) of matching lower bounds.

A {\em fault-tolerant} structure for a network is required to continue functioning following the failure of some of the network's edges or vertices. This paper considers {\em breadth-first search (BFS)} spanning trees, and addresses the problem of designing a sparse {\em fault-tolerant} BFS structure, or {\em FT-BFS structure} for short, namely, a sparse subgraph $T$ of the given network $G$ such that subsequent to the failure of a single edge or vertex, the surviving part $T'$ of $T$ still contains a BFS spanning tree for (the surviving part of) $G$. Our main results are as follows. We present an algorithm that for every $n$-vertex graph $G$ and source node $s$ constructs a (single edge failure) FT-BFS structure rooted at $s$ with $O(n \cdot \min\{Depth(s), \sqrt{n}\})$ edges, where $Depth(s)$ is the depth of the BFS tree rooted at $s$. This result is complemented by a matching lower bound. We then consider {\em fault-tolerant multi-source BFS structures}, or {\em FT-MBFS structure} for short, aiming to provide (following a failure) a BFS tree rooted at each source $s\in S$ for some subset of sources $S\subseteq V$. Finally, we propose an $O(\log n)$ approximation algorithm for constructing FT-BFS and FT-MBFS structures.

#### Algorithms for Hub Label Optimization

Maxim Babenko(); Andrew Goldberg(); Prof. Anupam Gupta(Carnegie Mellon University); Dr. Viswanath Nagarajan(University of Michigan)We propose a new approach to competitive analysis in online scheduling by introducing the novel concept of competitive-ratio approximation schemes. Such a scheme algorithmically constructs an online algorithm with a competitive ratio arbitrarily close to the best possible competitive ratio for any online algorithm. We study the problem of scheduling jobs online to minimize the weighted sum of completion times on parallel, related, and unrelated machines, and we derive both deterministic and randomized algorithms which are almost best possible among all online algorithms of the respective settings. We also generalize our techniques to arbitrary monomial cost functions and apply them to the makespan objective. Our method relies on an abstract characterization of online algorithms combined with various simplifications and transformations. We also contribute algorithmic means to compute the actual value of the best possible competitive ratio up to an arbitrary accuracy. This strongly contrasts nearly all previous manually obtained competitiveness results and, most importantly, it reduces the search for the optimal competitive ratio to a question that a computer can answer. We believe that our concept can also be applied to many other problems and yields a new perspective on online algorithms in general.

We study the Minimum Latency Submodular Cover problem (MLSC), which consists of a metric $(V,d)$ with source $r\in V$ and $m$ monotone submodular functions $f_1, f_2, ..., f_m: 2^V \rightarrow [0,1]$. The goal is to find a path originating at $r$ that minimizes the total cover time of all functions. This generalizes well-studied problems, such as Submodular Ranking [AzarG11] and Group Steiner Tree [GargKR00]. We give a polynomial time $O(\log \frac{1}{\eps} \cdot \log^{2+\delta} |V|)$-approximation algorithm for MLSC, where $\epsilon>0$ is the smallest non-zero marginal increase of any $\{f_i\}_{i=1}^m$ and $\delta>0$ is any constant. We also consider the Latency Covering Steiner Tree problem (LCST), which is the special case of \mlsc where the $f_i$s are multi-coverage functions. This is a common generalization of the Latency Group Steiner Tree [GuptaNR10, ChakrabartyS11] and Generalized Min-sum Set Cover [AzarGY09, BansalGK10] problems. We obtain an $O(\log^2|V|)$-approximation algorithm for LCST. Finally we study a natural stochastic extension of the Submodular Ranking problem, and obtain an adaptive algorithm with an $O(\log 1/ \eps)$ approximation ratio, which is best possible. This result also generalizes some previously studied stochastic optimization problems, such as Stochastic Set Cover [GoemansV06] and Shared Filter Evaluation [MunagalaSW07,LiuPRY08].

Edge and vertex connectivity are fundamental concepts in graph theory. While they have been thoroughly studied in the case of undirected graphs, surprisingly not much has been investigated for directed graphs. In this paper we study 2-edge connectivity problems in directed graphs and, in particular, we consider the computation of the following natural relation: We say that two vertices v and w are 2-edge-connected if there are two edge-disjoint paths from v to w and two edge-disjoint paths from w to v. This relation partitions the vertices into blocks such that all vertices in the same block are 2-edge-connected. Differently from the undirected case, those blocks do not correspond to the 2-edge-connected components of the graph. The main result of this paper is an algorithm for computing the 2-edge-connected blocks of a directed graph in linear time. Besides being asymptotically optimal, our algorithm improves significantly over previous bounds. Once the 2-edge-connected blocks are available, we can test in constant time if two vertices are 2-edge-connected. Additionally, when two query vertices v and w are not 2-edge-connected, we can produce in constant time a witness of this property. We are also able to compute in linear time a sparse certificate for this relation, i.e., a subgraph of the input graph that has O(n) edges and maintains the same 2-edge-connected blocks as the input graph, where n is the number of vertices.

Multi-Pivot Quicksort refers to variants of classical quicksort where in the partitioning step k pivots are used to split the input into k + 1 segments. For many years, multi-pivot quicksort was regarded as impractical, but in 2009 a 2-pivot approach by Yaroslavskiy, Bentley, and Bloch was chosen as the standard sorting algorithm in Sun's Java 7. In 2014 at ALENEX, Kushagra et al. introduced an even faster algorithm that uses three pivots. This paper studies what possible advantages multi-pivot quicksort might offer in general. The contributions are as follows: Natural comparison-optimal algorithms for multi-pivot quicksort are devised and analyzed. The analysis shows that the benefits of using multiple pivots with respect to the average comparison count are marginal and these strategies are inferior to simpler strategies such as the well known median-of-k approach. A substantial part of the partitioning cost is caused by rearranging elements. A rigorous analysis of an algorithm for rearranging elements in the partitioning step is carried out, observing mainly how often array cells are accessed during partitioning. The algorithm behaves best if 3 or 5 pivots are used. Experiments show that this translates into good cache behavior and is closest to predicting observed running times of multi-pivot quicksort algorithms. Finally, it is studied how choosing pivots from a sample affects sorting cost.

#### Better Balance by Being Biased: A 0.8776-Approximation for Max Bisection

Austrin, Per ; Benabbas, Siavosh ; Georgiou, KonstantinosThe dynamic facility location problem is a generalization of the classic facility location problem proposed by Eisenstat, Mathieu, and Schabanel to model the dynamics of evolving social/infrastructure networks. The generalization lies in that the distance metric between clients and facilities changes over time. This leads to a trade-off between optimizing the classic objective function and the "stability" of the solution: there is a switching cost charged every time a client changes the facility to which it is connected. While the standard linear program (LP) relaxation for the classic problem naturally extends to this problem, traditional LP-rounding techniques do not, as they are often sensitive to small changes in the metric resulting in frequent switches. We present a new LP-rounding algorithm for facility location problems, which yields the first constant approximation algorithm for the dynamic facility location problem. Our algorithm installs competing exponential clocks on the clients and facilities, and connect every client by the path that repeatedly follows the smallest clock in the neighborhood. The use of exponential clocks gives rise to several properties that distinguish our approach from previous LP-roundings for facility location problems. In particular, we use no clustering and we allow clients to connect through paths of arbitrary lengths. In fact, the clustering-free nature of our algorithm is crucial for applying our LP-rounding approach to the dynamic problem.