We consider the maximization problem, in the value oracle model, of functions defined on k-tuples of sets that are submodular in every orthant and r-wise monotone, where k>=2 and 1<=r<=k. We give an analysis of a deterministic greedy algorithm showing that any such function can be approximated to a factor of 1/(1+r). For r = k, we give an analysis of a randomised greedy algorithm showing that any such function can be approximated to a factor of $1/(1+\sqrt{k/2})$. In the case k=r=2, the considered functions correspond precisely to bisubmodular functions, and we obtain an approximation guarantee of 1/2. We show that, as in the case of submodular functions, this result is the best possible both in the value oracle model and under the assumption that NP != RP. Extending a result of Ando et al., we show that for any k>=3, submodularity in every orthant and pairwise monotonicity (i.e. r=2) precisely characterize k-submodular functions. Consequently, we obtain an approximation guarantee of 1/3 (and thus independent of k) for the maximization problem of k-submodular functions.
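The deterministic greedy scheme in the value-oracle model can be sketched as follows: scan the ground set once and give each element the label whose marginal gain, queried through the oracle, is largest. This is a minimal illustration, not the paper's full analysis; the toy modular objective (modular functions are k-submodular, and greedy is exact on them) and all names are ours.

```python
def greedy_k_submodular(V, k, f):
    """One-pass deterministic greedy in the value-oracle model: each element
    of the ground set V gets the label in {1, ..., k} whose marginal gain
    (queried through the oracle f) is largest."""
    labels = {v: 0 for v in V}          # 0 = not yet assigned
    for v in V:
        best_i, best_val = 1, None
        for i in range(1, k + 1):
            labels[v] = i
            val = f(labels)             # one value-oracle query
            if best_val is None or val > best_val:
                best_i, best_val = i, val
        labels[v] = best_i
    return labels, f(labels)

# Toy modular objective (hypothetical weights): the value of a labelling is
# the sum of each assigned element's weight under its chosen label.
w = {"a": (3, 1), "b": (2, 5), "c": (4, 4)}
f = lambda labels: sum(w[v][labels[v] - 1] for v in w if labels[v])
labels, value = greedy_k_submodular(list(w), 2, f)
```

On this modular toy instance greedy simply picks the larger weight per element, which makes the sketch easy to sanity-check by hand.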

#### Data Structures for Path Queries

He, Meng ; Munro, J. Ian ; Zhou, Gelin

Nearest-neighbor (NN) search, which returns the nearest neighbor of a query point in a set of points, is an important and widely studied problem in many fields, and it has a wide range of applications. In many of them, such as sensor databases, location-based services, face recognition, and mobile data, the location of data is imprecise. We therefore study nearest-neighbor queries in a probabilistic framework in which the location of each input point is specified as a probability density function. We present efficient algorithms for (i) computing all points that are nearest neighbors of a query point with nonzero probability; (ii) estimating, within a specified additive error, the probability of a point being the nearest neighbor of a query point; and (iii) using these estimates to return the point that maximizes the probability of being the nearest neighbor, or all points whose probability of being the NN exceeds some threshold.
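The probability in (ii) can be estimated naively by plain Monte Carlo, with additive error on the order of 1/sqrt(trials); the sampler interface and names below are our own illustration, and the paper's algorithms are far more efficient.

```python
import random

def nn_probability(samplers, query, trials=2000, seed=0):
    """Estimate, for each uncertain point, the probability that it is the
    nearest neighbor of `query`.  `samplers[i](rng)` draws one location of
    point i from its distribution; we count how often each point wins."""
    rng = random.Random(seed)
    wins = [0] * len(samplers)
    for _ in range(trials):
        locs = [s(rng) for s in samplers]
        d = [(x - query[0]) ** 2 + (y - query[1]) ** 2 for (x, y) in locs]
        wins[d.index(min(d))] += 1
    return [w / trials for w in wins]

# Degenerate demo: two points with point-mass distributions.
probs = nn_probability([lambda r: (0.0, 0.0), lambda r: (10.0, 10.0)],
                       (1.0, 1.0))
```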

We present three new results on one of the most basic problems in geometric data structures, 2-D orthogonal range counting. All the results are in the $w$-bit word RAM model.

- It is well known that there are linear-space data structures for 2-D orthogonal range counting with worst-case optimal query time O(log n / loglog n). We give an O(n loglog n)-space adaptive data structure that improves the query time to O(loglog n + log k / loglog n), where k is the output count. When k=O(1), our bounds match the state of the art for the 2-D orthogonal range emptiness problem [Chan, Larsen, and Patrascu, SoCG 2011].
- We give an O(n loglog n)-space data structure for approximate 2-D orthogonal range counting that can compute a (1+delta)-factor approximation to the count in O(loglog n) time for any fixed constant delta>0. Again, our bounds match the state of the art for the 2-D orthogonal range emptiness problem.
- Lastly, we consider the 1-D range selection problem, where a query in an array involves finding the kth least element in a given subarray. This problem is closely related to 2-D 3-sided orthogonal range counting. Recently, Jorgensen and Larsen [SODA 2011] presented a linear-space adaptive data structure with query time O(loglog n + log k / loglog n). We give a new linear-space structure that improves the query time to O(1 + log k / loglog n), exactly matching the lower bound proved by Jorgensen and Larsen.
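For reference, the problem itself (not the paper's data structures, whose bounds are far better) is solved by a textbook merge-sort tree in O(n log n) space and O(log^2 n) query time:

```python
from bisect import bisect_left, bisect_right

class MergeSortTree:
    """Baseline for 2-D orthogonal range counting: a segment tree over the
    points sorted by x, where each node stores the sorted y-values of its
    range.  O(n log n) space, O(log^2 n) per query."""

    def __init__(self, points):
        pts = sorted(points)                      # sort by x (then y)
        self.xs = [p[0] for p in pts]
        n = len(pts)
        self.size = 1
        while self.size < n:
            self.size *= 2
        self.tree = [[] for _ in range(2 * self.size)]
        for i, (_, y) in enumerate(pts):
            self.tree[self.size + i] = [y]
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = sorted(self.tree[2 * i] + self.tree[2 * i + 1])

    def count(self, x1, x2, y1, y2):
        """Number of points with x1 <= x <= x2 and y1 <= y <= y2."""
        lo = bisect_left(self.xs, x1) + self.size
        hi = bisect_right(self.xs, x2) + self.size   # half-open [lo, hi)
        total = 0
        while lo < hi:
            if lo & 1:
                ys = self.tree[lo]
                total += bisect_right(ys, y2) - bisect_left(ys, y1)
                lo += 1
            if hi & 1:
                hi -= 1
                ys = self.tree[hi]
                total += bisect_right(ys, y2) - bisect_left(ys, y1)
            lo //= 2
            hi //= 2
        return total

rc = MergeSortTree([(1, 1), (2, 5), (3, 3), (4, 4), (5, 2)])
```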

#### Compressed Cache-Oblivious String B-tree

Ferragina, Paolo ; Venturini, Rossano

We consider the {\em matroid median} problem~\cite{KrishnaswamyKNSS11}, wherein we are given a set of facilities with opening costs and a matroid on the facility-set, and clients with demands and connection costs, and we seek to open an independent set of facilities and assign clients to open facilities so as to minimize the sum of the facility-opening and client-connection costs. We give a simple 8-approximation algorithm for this problem based on LP-rounding, which improves upon the 16-approximation in~\cite{KrishnaswamyKNSS11}. We illustrate the power and versatility of our techniques by deriving: (a) an 8-approximation for the {\em two-matroid median} problem, a generalization of matroid median that we introduce involving two matroids; and (b) a 24-approximation algorithm for {\em matroid median with penalties}, which is a vast improvement over the 360-approximation obtained in~\cite{KrishnaswamyKNSS11}. We show that a variety of seemingly disparate facility-location problems considered in the literature---data placement problem, mobile facility location, $k$-median forest, metric uniform minimum-latency {\footnotesize \textsf{UFL}}---in fact reduce to the matroid median or two-matroid median problems, and thus obtain {\em improved} approximation guarantees for all these problems. Our techniques also yield an improvement for the knapsack median problem.

#### Inapproximability of the Multi-level Uncapacitated Facility Location Problem

Krishnaswamy, Ravishankar ; Sviridenko, Maxim

We study the complexity of the Channel Assignment problem. An open problem asks whether Channel Assignment admits an $O(c^n)$-time algorithm, for a constant $c$ independent of the weights on the edges. We answer this question in the negative, i.e., we show that there is no $2^{o(n\log n)}$-time algorithm solving Channel Assignment unless the Exponential Time Hypothesis fails. Note that the currently best known algorithm works in time $O^*(n!) = 2^{O(n\log n)}$, so our lower bound is tight.

This paper presents new algorithms for two problems related to the Miller-Rabin-Selfridge primality test. The first problem is to tabulate strong pseudoprimes to a fixed base $a$. Tabulating up to $x$ requires $O(x)$ multiplications, where previous methods required $O(x \log{x})$ multiplications. The second problem is to find all strong liars and witnesses, given a fixed odd composite $n$. This appears to be unstudied, and an algorithm is presented that requires $O(n (\log\log{n})^2)$ multiplications. Although interesting in their own right, a notable application is the search for sets of composites with no reliable witness.
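A brute-force version of the second problem, enumerating the strong liars of a fixed odd composite $n$, makes the object of study concrete (the paper's algorithm achieves this in far fewer multiplications):

```python
def strong_liars(n):
    """All bases a in [1, n-1] for which the strong (Miller-Rabin) test
    declares the odd composite n 'probably prime'.  Brute force: O(n log n)
    multiplications overall."""
    assert n % 2 == 1 and n > 2
    d, s = n - 1, 0
    while d % 2 == 0:                # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    liars = []
    for a in range(1, n):
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            liars.append(a)
            continue
        for _ in range(s - 1):       # repeated squaring: look for -1 mod n
            x = x * x % n
            if x == n - 1:
                liars.append(a)
                break
    return liars
```

For example, 9 has only the trivial strong liars 1 and 8, while 25 is a strong pseudoprime to base 7.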

We address the vexing issue of deletions in balanced trees. Rebalancing after a deletion is generally more complicated than rebalancing after an insertion. Textbooks neglect deletion rebalancing, and many B-tree-based database systems do not do it.
We describe a relaxation of AVL trees in which rebalancing is done after insertions but not after deletions, yet access time remains logarithmic in the number of insertions. For many applications of balanced trees, our structure offers performance competitive with that of classical balanced trees. With the addition of periodic rebuilding, the performance of our structure is theoretically superior to that of many if not all classic balanced tree structures. Our structure needs lglg(*m*) + 1 bits of balance information per node, where *m* is the number of insertions and lg is the base-two logarithm, or lglg(*n*) + O(1) with periodic rebuilding, where *n* is the number of nodes. An insertion takes up to two rotations and O(1) amortized time, the same as in standard AVL trees. Using an analysis that relies on an exponential potential function, we show that rebalancing steps occur with a frequency that is exponentially small in the height of the affected node. Our techniques apply to other types of balanced trees, notably B-trees, as we show in a companion paper, and in particular to red-black trees, which can be viewed as a special case of B-trees.

2-Opt is a simple local search heuristic for the traveling salesperson problem, which performs very well in experiments both with respect to running time and solution quality. In contrast to this, there are instances on which 2-Opt may need an exponential number of steps to reach a local optimum. To understand why 2-Opt usually finds local optima quickly in experiments, we study its expected running time in the model of smoothed analysis, which can be considered as a less pessimistic variant of worst-case analysis in which the adversarial input is subject to a small amount of random noise. In our probabilistic input model an adversary chooses an arbitrary graph~$G$ and additionally a probability density function for each edge according to which its length is chosen. We prove that in this model the expected number of local improvements is~$O(mn\phi(\log m)^3\cdot 4^{3\sqrt{\ln{m}}})=m^{1+o(1)}n\phi$, where~$n$ and~$m$ denote the number of vertices and edges of~$G$, respectively, and~$\phi$ denotes an upper bound on the density functions.
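For concreteness, the heuristic itself is just: while some pair of tour edges can be replaced by a cheaper pair, reverse the segment between them. A compact sketch on a symmetric distance matrix:

```python
def tour_length(tour, dist):
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def two_opt(tour, dist):
    """2-Opt local search: repeatedly replace edges (a,b) and (c,d) by
    (a,c) and (b,d) -- i.e. reverse the segment between them -- while this
    shortens the tour."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue                    # same pair of tour edges
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Demo: four corners of the unit square, starting from a crossing tour.
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dist = [[((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 for (x2, y2) in pts]
        for (x1, y1) in pts]
tour = two_opt([0, 2, 1, 3], dist)
```

The crossing start tour has length 2 + 2*sqrt(2); one improving move uncrosses it to the optimal perimeter of length 4.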

We study streaming algorithms for the interval selection problem: finding a maximum cardinality subset of disjoint intervals on the line. A deterministic $2$-approximation streaming algorithm for this problem is developed, together with an algorithm for the special case of proper intervals, achieving an improved approximation ratio of $3/2$. We complement these upper bounds by proving that they are essentially best possible in the streaming setting: it is shown that an approximation ratio of $2 - \epsilon$ (or $3 / 2 - \epsilon$ for proper intervals) cannot be achieved unless the space is linear in the input size. In passing, we also answer an open question of Adler and Azar (J.\ Scheduling 2003) regarding the space complexity of constant-competitive randomized preemptive online algorithms for the same problem.
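As a baseline, the offline problem is solved exactly by the classic earliest-endpoint greedy; the paper asks how much of this survives when only small space is allowed. A sketch, assuming closed intervals given as (left, right) pairs:

```python
def max_disjoint_intervals(intervals):
    """Classic offline greedy: sort by right endpoint and take every
    interval that starts strictly after the last chosen one ends.  Exact,
    but needs the whole input at once -- unlike a streaming algorithm."""
    chosen = []
    last_end = float("-inf")
    for l, r in sorted(intervals, key=lambda iv: iv[1]):
        if l > last_end:          # closed intervals: no shared point allowed
            chosen.append((l, r))
            last_end = r
    return chosen

chosen = max_disjoint_intervals([(1, 3), (2, 4), (3, 5), (6, 7)])
```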

This paper studies the set cover problem under the semi-streaming model. The underlying set system is formalized in terms of a hypergraph $G = (V, E)$ whose edges arrive one-by-one and the goal is to construct an edge cover $F \subseteq E$ with the objective of minimizing the cardinality (cost in the weighted case) of $F$. We consider a parameterized relaxation of this problem, where given some $0 \leq \epsilon < 1$, the goal is to construct an edge $(1 - \epsilon)$-cover, namely, a subset of edges incident to all but an $\epsilon$-fraction of the vertices (or their benefit in the weighted case). The key limitation imposed on the algorithm is that its space is limited to (poly)logarithmically many bits per vertex. Our main result is an asymptotically tight trade-off between $\epsilon$ and the approximation ratio: We design a semi-streaming algorithm that on input hypergraph $G$, constructs a succinct data structure $\mathcal{D}$ such that for every $0 \leq \epsilon < 1$, an edge $(1 - \epsilon)$-cover that approximates the optimal edge \mbox{($1$-)cover} within a factor of $f(\epsilon, n)$ can be extracted from $\mathcal{D}$ (efficiently and with no additional space requirements), where \[ f(\epsilon, n) = \left\{ \begin{array}{ll} O (1 / \epsilon), & \text{if } \epsilon > 1 / \sqrt{n} \\ O (\sqrt{n}), & \text{otherwise} \end{array} \right. \, . \] In particular for the traditional set cover problem we obtain an $O(\sqrt{n})$-approximation. This algorithm is proved to be best possible by establishing a family (parameterized by $\epsilon$) of matching lower bounds.
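An offline greedy baseline for the unweighted edge $(1-\epsilon)$-cover objective (not the paper's semi-streaming algorithm, which must work with only polylogarithmic bits per vertex) looks like this; the interface is our own:

```python
def greedy_partial_cover(edges, vertices, eps):
    """Greedy edge (1-eps)-cover of an unweighted hypergraph: repeatedly
    add the edge covering the most still-uncovered vertices, until at most
    an eps fraction of the vertices remains uncovered."""
    uncovered = set(vertices)
    target = eps * len(uncovered)
    cover = []
    while len(uncovered) > target:
        best = max(edges, key=lambda e: len(uncovered & set(e)))
        if not uncovered & set(best):
            break                    # remaining vertices are not coverable
        cover.append(best)
        uncovered -= set(best)
    return cover, uncovered

edges = [frozenset({1, 2, 3}), frozenset({3, 4}),
         frozenset({5, 6}), frozenset({1, 4, 5})]
cover, leftover = greedy_partial_cover(edges, range(1, 7), eps=0.0)
```

With eps = 0 this is the standard greedy set (edge) cover; raising eps lets the algorithm stop early, which is exactly the relaxation the trade-off above is about.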

#### An Improved Approximation Algorithm for the Edge-Disjoint Paths Problem with Congestion Two

Kawarabayashi, Ken-ichi ; Kobayashi, Yusuke

#### Approximation Algorithms for Movement Repairmen

Hajiaghayi, MohammadTaghi ; Khandekar, Rohit ; Khani, Mohammad Reza ; Kortsarz, Guy

#### Agnostic learning in permutation invariant domains

Karl Wimmer (Duquesne University - Mathematics)

Edge and vertex connectivity are fundamental concepts in graph theory. While they have been thoroughly studied in the case of undirected graphs, surprisingly not much has been investigated for directed graphs. In this paper we study 2-edge connectivity problems in directed graphs and, in particular, we consider the computation of the following natural relation: We say that two vertices v and w are 2-edge-connected if there are two edge-disjoint paths from v to w and two edge-disjoint paths from w to v. This relation partitions the vertices into blocks such that all vertices in the same block are 2-edge-connected. Differently from the undirected case, those blocks do not correspond to the 2-edge-connected components of the graph. The main result of this paper is an algorithm for computing the 2-edge-connected blocks of a directed graph in linear time. Besides being asymptotically optimal, our algorithm improves significantly over previous bounds. Once the 2-edge-connected blocks are available, we can test in constant time if two vertices are 2-edge-connected. Additionally, when two query vertices v and w are not 2-edge-connected, we can produce in constant time a witness of this property. We are also able to compute in linear time a sparse certificate for this relation, i.e., a subgraph of the input graph that has O(n) edges and maintains the same 2-edge-connected blocks as the input graph, where n is the number of vertices.
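A single query of the relation can be answered naively with two unit-capacity flow computations (two edge-disjoint paths in each direction); the paper's contribution is computing all blocks in linear time overall. A sketch, with an adjacency-dict interface of our own choosing:

```python
from collections import deque

def _edge_disjoint_paths(adj, s, t, need=2):
    """Count up to `need` edge-disjoint s->t paths by unit-capacity
    augmentation: BFS for an augmenting path in the residual graph."""
    residual = {u: dict.fromkeys(vs, 1) for u, vs in adj.items()}
    for u in list(residual):
        for v in residual[u]:
            residual.setdefault(v, {})     # make sure every node appears
    paths = 0
    while paths < need:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, cap in residual[u].items():
                if cap and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        v = t                              # push one unit along the path
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= 1
            residual[v][u] = residual[v].get(u, 0) + 1
            v = u
        paths += 1
    return paths

def two_edge_connected(adj, v, w):
    """Naive per-query test that v and w are 2-edge-connected in a digraph."""
    return (_edge_disjoint_paths(adj, v, w) >= 2 and
            _edge_disjoint_paths(adj, w, v) >= 2)

tri = {0: [1, 2], 1: [0, 2], 2: [0, 1]}   # bidirected triangle
cyc = {0: [1], 1: [2], 2: [0]}            # single directed cycle
```

In the bidirected triangle every pair is 2-edge-connected; in a single directed cycle no pair is, since each direction admits only one path.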

Given an $n$-vertex undirected graph $G = (V,E)$ and a parameter $k = 1,2,\ldots$, Thorup-Zwick's distance oracle has size $O(k n^{1+1/k})$, and upon a query $(u,v)$ it constructs a path $\Pi$ between $u$ and $v$ of length $\delta(u,v)$ such that $d_G(u,v) \le \delta(u,v) \le (2k-1) d_G(u,v)$. The query time of this oracle is $O(k)$ (in addition to the length of the returned path), and it was subsequently improved to $O(1)$ \cite{WN12,C14}. A major drawback of the oracle of Thorup and Zwick is that its space is $\Omega(n \cdot \log n)$. Mendel and Naor devised an oracle with space $O(n^{1+1/k})$ and stretch $O(k)$, but their oracle can only report distance estimates and not actual paths. In this paper we devise a path-reporting distance oracle with size $O(n^{1+1/k})$, stretch $O(k)$ and query time $O(n^{\epsilon})$, for an arbitrarily small $\epsilon > 0$. Another variant of our oracle has size $O(n \log\log n)$, polylogarithmic stretch, and query time $O(\log\log n)$. For unweighted graphs we devise a distance oracle with multiplicative stretch $O(1)$, additive stretch $O(\beta(k))$, for a function $\beta()$, space $O(n^{1+1/k} \cdot \beta)$, and query time $O(n^{\epsilon})$, for an arbitrarily small constant $\epsilon > 0$. The trade-off between multiplicative stretch and size in these oracles is far below the girth conjecture threshold (which is stretch $2k-1$ and size $O(n^{1+1/k})$). An important novel tool that we develop on the way to these results is a {\em distance-preserving path-reporting oracle}. We believe that this oracle is of independent interest.

Multi-Pivot Quicksort refers to variants of classical quicksort where in the partitioning step k pivots are used to split the input into k + 1 segments. For many years, multi-pivot quicksort was regarded as impractical, but in 2009 a 2-pivot approach by Yaroslavskiy, Bentley, and Bloch was chosen as the standard sorting algorithm in Sun's Java 7. In 2014 at ALENEX, Kushagra et al. introduced an even faster algorithm that uses three pivots. This paper studies what possible advantages multi-pivot quicksort might offer in general. The contributions are as follows: Natural comparison-optimal algorithms for multi-pivot quicksort are devised and analyzed. The analysis shows that the benefits of using multiple pivots with respect to the average comparison count are marginal and these strategies are inferior to simpler strategies such as the well known median-of-k approach. A substantial part of the partitioning cost is caused by rearranging elements. A rigorous analysis of an algorithm for rearranging elements in the partitioning step is carried out, observing mainly how often array cells are accessed during partitioning. The algorithm behaves best if 3 or 5 pivots are used. Experiments show that this translates into good cache behavior and is closest to predicting observed running times of multi-pivot quicksort algorithms. Finally, it is studied how choosing pivots from a sample affects sorting cost.
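The 2-pivot scheme mentioned above partitions around two pivots p <= q into the three segments < p, [p, q], and > q. A compact sketch in the style of Yaroslavskiy's variant (recursion thresholds, insertion-sort cutoffs, and pivot sampling are all omitted):

```python
def dual_pivot_quicksort(a, lo=0, hi=None):
    """2-pivot quicksort: partition a[lo..hi] around pivots p <= q into
    three segments, then recurse on each segment."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    p, q = a[lo], a[hi]
    lt, gt, i = lo + 1, hi - 1, lo + 1
    while i <= gt:
        if a[i] < p:                       # goes to the left segment
            a[i], a[lt] = a[lt], a[i]
            lt += 1
            i += 1
        elif a[i] > q:                     # goes to the right segment
            a[i], a[gt] = a[gt], a[i]
            gt -= 1                        # re-examine the swapped-in element
        else:
            i += 1                         # stays in the middle segment
    lt -= 1
    gt += 1
    a[lo], a[lt] = a[lt], a[lo]            # move pivots into place
    a[hi], a[gt] = a[gt], a[hi]
    dual_pivot_quicksort(a, lo, lt - 1)
    dual_pivot_quicksort(a, lt + 1, gt - 1)
    dual_pivot_quicksort(a, gt + 1, hi)
    return a

data = dual_pivot_quicksort([5, 3, 8, 1, 9, 2, 7, 4, 6, 0])
```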

In computing, as in many aspects of life, changes incur cost. Many optimization problems are formulated as a one-time instance starting from scratch. However, a common case that arises is when we already have a set of prior assignments, and must decide how to respond to a new set of constraints, given that each change from the current assignment comes at a price. That is, we would like to maximize the fitness or efficiency of our system, but we need to balance it with the changeout cost from the previous state. We provide a precise formulation for this tradeoff and analyze the resulting {\em stable extensions} of some fundamental problems in measurement and analytics. Our main technical contribution is a stable extension of PPS (probability proportional to size) weighted random sampling, with applications to monitoring and anomaly detection problems. We also provide a general framework that applies to top-k, minimum spanning tree, and assignment. In both cases, we are able to provide exact solutions, and discuss efficient incremental algorithms that can find new solutions as the input changes.
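One standard way to make a PPS-style sample stable under weight updates is to give every item a fixed pseudo-random tag and include the item whenever its inclusion probability exceeds the tag; a weight change then alters the sample only where a probability actually crosses its tag. This illustrates the flavour of the stability trade-off only; the hash-tag construction below is ours, not the paper's algorithm.

```python
import hashlib

def pps_sample(weights, tau, salt="v1"):
    """Consistent PPS-style sample: item i is kept iff u_i < min(1, w_i/tau),
    where u_i is a fixed pseudo-random tag in [0, 1) derived by hashing the
    item's key.  Because tags never change, re-sampling after a weight
    update changes only items whose inclusion probability crossed its tag,
    keeping the changeout cost small."""
    sample = set()
    for key, w in weights.items():
        h = hashlib.sha256(f"{salt}:{key}".encode()).digest()
        u = int.from_bytes(h[:8], "big") / 2**64        # fixed tag
        if u < min(1.0, w / tau):
            sample.add(key)
    return sample

s1 = pps_sample({"a": 10, "b": 0.5}, tau=5)
s2 = pps_sample({"a": 10, "b": 100}, tau=5)
```

Note the monotonicity this buys: raising weights can only add items to the sample, never evict unrelated ones.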

#### Better Balance by Being Biased: A 0.8776-Approximation for Max Bisection

Austrin, Per ; Benabbas, Siavosh ; Georgiou, Konstantinos

The dynamic facility location problem is a generalization of the classic facility location problem proposed by Eisenstat, Mathieu, and Schabanel to model the dynamics of evolving social/infrastructure networks. The generalization lies in that the distance metric between clients and facilities changes over time. This leads to a trade-off between optimizing the classic objective function and the "stability" of the solution: there is a switching cost charged every time a client changes the facility to which it is connected. While the standard linear program (LP) relaxation for the classic problem naturally extends to this problem, traditional LP-rounding techniques do not, as they are often sensitive to small changes in the metric, resulting in frequent switches. We present a new LP-rounding algorithm for facility location problems, which yields the first constant approximation algorithm for the dynamic facility location problem. Our algorithm installs competing exponential clocks on the clients and facilities, and connects every client by the path that repeatedly follows the smallest clock in the neighborhood. The use of exponential clocks gives rise to several properties that distinguish our approach from previous LP-roundings for facility location problems. In particular, we use no clustering and we allow clients to connect through paths of arbitrary lengths. In fact, the clustering-free nature of our algorithm is crucial for applying our LP-rounding approach to the dynamic problem.