A Bloom filter based on Redis (with segmented keys, so that no single key grows too large) and a database (to resolve hash collisions)

Keywords: Jedis Redis Java Database

1. Calculate the hash value of the key.
2. Calculate the offset inside a segment as the remainder of the hash value divided by the fixed segment size (fixSize in the code).
3. Calculate the segment's bitKey as the namespace prefix plus the hash value divided by the fixed segment size.
4. Check in Redis whether the bit at (bitKey, offset) is set.
5. If the bit is set, call containsFromDb to confirm that the key really exists, because a different key with the same hash value sets the same bit.
6. Splitting the bits into segments with Redis SETBIT keeps any single key from holding too much data.
7. If Redis runs as a cluster, the segment keys are also spread across the nodes, because each key is assigned to a slot by the CRC16 algorithm. This avoids overloading a single node.
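
Steps 2, 3 and 7 can be sketched in plain Java without a Redis connection. This is only an illustration under a few assumptions: SegmentMath and its method names are hypothetical, Math.floorDiv and Math.floorMod are my choice so that negative hashCode values still give a non-negative offset, and clusterSlot is a simplified version of the Redis Cluster slot rule (CRC16 of the key modulo 16384) that ignores hash tags in braces.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; it is not part of the class shown in this post.
final class SegmentMath {

    private SegmentMath() {
    }

    // Index of the segment (Redis key) that holds the bit for this hash.
    // floorDiv rounds toward negative infinity, so it pairs correctly with floorMod.
    static int segmentIndex(int hash, int fixSize) {
        return Math.floorDiv(hash, fixSize);
    }

    // Bit position inside the segment; always in the range [0, fixSize).
    static int offset(int hash, int fixSize) {
        return Math.floorMod(hash, fixSize);
    }

    // Segment key: namespace prefix plus segment index.
    static String bitKey(String nameSpace, int hash, int fixSize) {
        return nameSpace + segmentIndex(hash, fixSize);
    }

    // CRC16 (XMODEM variant: polynomial 0x1021, initial value 0) modulo 16384.
    // Simplification: hash tags such as {user} are not handled here.
    static int clusterSlot(String key) {
        int crc = 0;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc % 16384;
    }

    public static void main(String[] args) {
        int fixSize = 1000;
        int hash = "abc".hashCode(); // 96354
        String key = bitKey("filter:", hash, fixSize);
        System.out.println(key + " offset=" + offset(hash, fixSize)
                + " slot=" + clusterSlot(key));
    }
}
```

With fixSize = 1000 the key "abc" (hash 96354) lands in segment 96 at offset 354. Neighbouring segment keys generally hash to different slots, which is what spreads the load in cluster mode.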

Code example:
package six.com.crawler.work.space;

import java.util.Objects;

import redis.clients.jedis.Jedis;

public class RedisAndDbBloomFilter {

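    private final String nameSpace;
    private final Jedis jedis;
    private final int fixSize;

    public RedisAndDbBloomFilter(String nameSpace, Jedis jedis, int fixSize) {
        if (fixSize <= 0) {
            throw new IllegalArgumentException("fixSize must be positive");
        }
        this.nameSpace = nameSpace;
        this.jedis = jedis;
        this.fixSize = fixSize;
    }

    private int getHash(String key) {
        return key.hashCode();
    }

    private void addToDb(int hash, String key) {
        // TODO save the record to the db
    }

    private boolean containsFromDb(int hash, String key) {
        // TODO query the db for a record matching both hash and key
        return false;
    }

    /**
     * Calculate the bitKey (segment key) based on hash and fixSize.
     * floorDiv is used because String.hashCode() can be negative.
     * @param hash hash value of the key
     * @return the Redis key of the segment that holds the bit
     */
    private String getBitKey(int hash) {
        int bitKeyIndex = Math.floorDiv(hash, fixSize);
        return nameSpace + bitKeyIndex;
    }

    /**
     * Calculate the bit offset inside a segment.
     * floorMod keeps the result in [0, fixSize); a plain % would return
     * a negative offset for a negative hash, which Redis rejects.
     * @param hash hash value of the key
     * @return the offset of the bit inside the segment
     */
    private int getOffset(int hash) {
        return Math.floorMod(hash, fixSize);
    }

    /**
     * Determine whether a given key exists
     * @param key the key to look up
     * @return true only if the bit is set and the db confirms the key
     */
    public boolean contains(String key) {
        // TODO needs a distributed lock in cluster mode, and a thread lock in stand-alone mode.
        Objects.requireNonNull(key, "the key must not be null");
        int hash = getHash(key);
        int offset = getOffset(hash);
        String bitKey = getBitKey(hash);
        boolean bitIsSet = jedis.getbit(bitKey, offset);
        // The bit alone can be a false positive, so confirm against the db.
        return bitIsSet && containsFromDb(hash, key);
    }

    /**
     * Add a filter record based on the given key
     * @param key the key to record
     */
    public void addRecord(String key) {
        // TODO needs a distributed lock in cluster mode, and a thread lock in stand-alone mode.
        Objects.requireNonNull(key, "the key must not be null");
        int hash = getHash(key);
        int offset = getOffset(hash);
        String bitKey = getBitKey(hash);
        jedis.setbit(bitKey, offset, true);
        addToDb(hash, key);
    }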

}
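
To see why the database check matters, here is a small in-memory sketch of the same flow. It is an assumption-laden stand-in, not the real class: InMemorySegmentedFilter is a hypothetical name, a Map of BitSet plays the role of the Redis bit keys, and a HashSet plays the role of the database table. The strings "Aa" and "BB" are used because they have the same hashCode in Java.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for illustration only.
final class InMemorySegmentedFilter {

    private final int fixSize;
    // Stands in for the Redis bit keys: segment index -> bits of that segment.
    private final Map<Integer, BitSet> segments = new HashMap<>();
    // Stands in for the database table of stored keys.
    private final Set<String> db = new HashSet<>();

    InMemorySegmentedFilter(int fixSize) {
        if (fixSize <= 0) {
            throw new IllegalArgumentException("fixSize must be positive");
        }
        this.fixSize = fixSize;
    }

    void addRecord(String key) {
        int hash = key.hashCode();
        int index = Math.floorDiv(hash, fixSize);
        int offset = Math.floorMod(hash, fixSize);
        segments.computeIfAbsent(index, i -> new BitSet(fixSize)).set(offset);
        db.add(key);
    }

    // First stage only: is the bit for this key's hash set?
    boolean bitIsSet(String key) {
        int hash = key.hashCode();
        BitSet segment = segments.get(Math.floorDiv(hash, fixSize));
        return segment != null && segment.get(Math.floorMod(hash, fixSize));
    }

    // Both stages: the bit must be set and the db must confirm the key.
    boolean contains(String key) {
        return bitIsSet(key) && db.contains(key);
    }

    public static void main(String[] args) {
        InMemorySegmentedFilter filter = new InMemorySegmentedFilter(1000);
        filter.addRecord("Aa");
        System.out.println(filter.bitIsSet("BB"));  // true: same hash as "Aa"
        System.out.println(filter.contains("BB"));  // false: the db rules it out
        System.out.println(filter.contains("Aa"));  // true
    }
}
```

The bit check is the cheap first stage and filters out most keys that were never added. Only keys whose bit is set pay for the database lookup.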

Posted by PerfecTiion on Wed, 13 Feb 2019 09:15:18 -0800