Hardness of bichromatic closest pair with Jaccard similarity

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Consider collections A and B of red and blue sets, respectively. Bichromatic Closest Pair is the problem of finding a pair from A × B that has similarity higher than a given threshold according to some similarity measure. Our focus here is the classic Jaccard similarity |a ∩ b|/|a ∪ b| for (a, b) ∈ A × B. We consider the approximate version of the problem where we are given thresholds j1 > j2 and wish to return a pair from A × B that has Jaccard similarity higher than j2 if there exists a pair in A × B with Jaccard similarity at least j1. The classic locality sensitive hashing (LSH) algorithm of Indyk and Motwani (STOC '98), instantiated with the MinHash LSH function of Broder et al., solves this problem in Õ(n^{2−δ}) time if j1 ≥ j2^{1−δ}. In particular, for δ = Ω(1), the approximation ratio j1/j2 = 1/j2^{δ} increases polynomially in 1/j2. In this paper we give a corresponding hardness result. Assuming the Orthogonal Vectors Conjecture (OVC), we show that there cannot be a general solution that solves the Bichromatic Closest Pair problem in O(n^{2−Ω(1)}) time for j1/j2 = 1/j2^{o(1)}. Specifically, assuming OVC, we prove that for any δ > 0 there exists an ε > 0 such that Bichromatic Closest Pair with Jaccard similarity requires time Ω(n^{2−δ}) for any choice of thresholds j2 < j1 < 1 − δ that satisfy j1 ≤ j2^{1−ε}.
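To make the problem statement concrete, the following is a minimal Python sketch of the Jaccard similarity and of the naive quadratic-time baseline for the approximate problem. It is not the LSH algorithm and not the paper's construction; the function names `jaccard` and `approx_bichromatic_closest_pair` are hypothetical, and treating the similarity of two empty sets as 0 is an assumption made here for convenience.

```python
from itertools import product
from typing import FrozenSet, Iterable, Optional, Tuple

Set = FrozenSet[int]


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|.

    Assumption: two empty sets are given similarity 0.0.
    """
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def approx_bichromatic_closest_pair(
    A: Iterable[Set], B: Iterable[Set], j2: float
) -> Optional[Tuple[Set, Set]]:
    """Naive baseline: scan all of A × B and return the first pair whose
    similarity is strictly higher than j2, or None if no such pair exists.

    If some pair has similarity at least j1 > j2, the scan is guaranteed
    to return a valid answer, so j1 is not needed as a parameter here.
    """
    for a, b in product(list(A), list(B)):
        if jaccard(a, b) > j2:
            return a, b
    return None


if __name__ == "__main__":
    red = [frozenset({1, 2, 3}), frozenset({4, 5})]
    blue = [frozenset({2, 3, 4}), frozenset({7, 8})]
    print(approx_bichromatic_closest_pair(red, blue, j2=0.4))
```

The baseline compares all |A| · |B| pairs, which is the quadratic running time that the hardness result says cannot be improved polynomially in the stated threshold regime, assuming OVC.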

Original language: English
Title of host publication: 27th Annual European Symposium on Algorithms, ESA 2019
Editors: Michael A. Bender, Ola Svensson, Grzegorz Herman
Number of pages: 13
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 2019
Article number: 74
ISBN (Electronic): 9783959771245
DOIs
Publication status: Published - 2019
Event: 27th Annual European Symposium on Algorithms, ESA 2019 - Munich/Garching, Germany
Duration: 9 Sep 2019 → 11 Sep 2019

Conference

Conference: 27th Annual European Symposium on Algorithms, ESA 2019
Country: Germany
City: Munich/Garching
Period: 09/09/2019 → 11/09/2019
Series: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 144
ISSN: 1868-8969

    Research areas

  • Bichromatic closest pair, Fine-grained complexity, Jaccard similarity, Set similarity search

ID: 238368920