Abstract:
Bloom-filter-based algorithms have proven successful as a
very efficient technique for reducing the communication costs of
database joins in a distributed setting. However, the full
potential of Bloom filters has not yet been exploited. In particular,
for multi-joins where the data is distributed
among several sites, additional optimization opportunities
arise, which require new Bloom filter operations and computations.
In this paper, we present these extensions and
point out how they improve the performance of such distributed
joins. While the paper focuses on efficient join
computation, the described extensions are applicable to a
wide range of uses in which Bloom filters are employed for
compressed set representation.
Keywords: Distributed Databases, Bloom Filter Operations, Bloom Filters