Implementing a simple key-value database with PHP

Keywords: Database PHP github

I have recently been reading "PHP Core Technologies and Best Practices". I only skimmed the first part, but the chapter on hash algorithms and database implementation caught my interest with its example of a simple key-value database written in PHP. Reading it was not enough for me, so I reimplemented the example as an exercise, both to consolidate what I had learned and to deepen my understanding of how the data is stored.

1. The index is a hash table, and hash collisions are resolved by separate chaining (each bucket holds a linked list of index nodes).
2. The index is non-clustered: index files and data files are kept apart (saved in the index and data folders respectively in this example).
3. A single node in the hash table is laid out as follows: position of the next index node (4 bytes) + key (128 bytes) + offset of the value in the data file (4 bytes) + length of the value (4 bytes).
4. Deleting a key only removes its node from the index; the corresponding value still exists in the data file. Removing the value from the data file as well would mean rewriting the data offsets of all affected nodes in the index file, which is too time-consuming.
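
The node layout in item 3 can be sketched with pack() and unpack(). This is only an illustration of the 4 + 128 + 4 + 4 byte layout; the names KEY_SIZE, NODE_SIZE, packNode and unpackNode are hypothetical and do not appear in the implementation below.

```php
<?php
// Hypothetical helper names; only the 4 + 128 + 4 + 4 layout comes from item 3.
const KEY_SIZE = 128;
const NODE_SIZE = 4 + KEY_SIZE + 4 + 4; // bytes per index node

// Build one index node as a binary string.
function packNode(int $next, string $key, int $dataOffset, int $dataLen): string
{
    if (strlen($key) > KEY_SIZE) {
        throw new InvalidArgumentException('key longer than ' . KEY_SIZE . ' bytes');
    }
    return pack('L', $next)                 // position of the next node (0 = end of chain)
        . str_pad($key, KEY_SIZE, "\0")     // key, right-padded with 0 bytes
        . pack('L', $dataOffset)            // offset of the value in the data file
        . pack('L', $dataLen);              // length of the value in bytes
}

// Split a binary node back into its four fields.
function unpackNode(string $node): array
{
    $fields = unpack('Lnext', substr($node, 0, 4));
    $fields['key'] = rtrim(substr($node, 4, KEY_SIZE), "\0");
    $fields += unpack('LdataOffset/LdataLen', substr($node, 4 + KEY_SIZE, 8));
    return $fields;
}

$node = packNode(0, 'key1', 0, 7);
var_dump(strlen($node) === NODE_SIZE);
print_r(unpackNode($node));
```

Because every node has the same fixed size, a node can be read back with a single fread() once its position in the index file is known.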

The following is the implementation code:
I. Database Connection

<?php

/*
*
*   The index file is binary: pack() converts values to binary when writing, and unpack() converts the binary back to values when reading.
*
*   Create data and index folders under the current folder and make them readable and writable by PHP (the simplest way is to grant 777 permissions directly).
*
*   $db = new MyDataBase();
*
*    //Database Connection
*   $dbHandler = $db->connect('dbTest');
*
*   //Write data, only key-value form is supported
*   $dbHandler->insert('key1', '1111111'); 
*
*   //Data search
*   $dbHandler->find('key1'); 
*
*   //Delete data
*   $dbHandler->delete('key1');  
*
*   //Close Data Connection
*   $db->close(); 
*
*/
define('DB_INSERT_SUCCESS', 'SUCCESS');
define('DB_INSERT_FAILED', 'FAILED');
define('DB_DELETE_SUCCESS', 'SUCCESS');
define('DB_DELETET_FAILED', 'FAILED');
define('DB_EXISTS_KEY', 'KEYEXISTS');

//Number of buckets (chain head slots) in the hash table.
define('DB_BUCKET_SIZE', 262144);

//Maximum size of a key in bytes
define('DB_KEY_SIZE', 128);

//Size of one index node in bytes. An index node consists of:
//a 4-byte pointer to the next index node + a 128-byte key +
//a 4-byte offset of the value in the data file + a 4-byte value length
define('DB_INDEX_SIZE', DB_KEY_SIZE + 12); 

include './DataBaseObject.php';

class MyDataBase
{
    private $dataHandler;
    private $indexHandler;
    private $closeFlag = false;

    public function connect($databaseName)
    {
        $indexFile = './index/'.$databaseName.'.inx';
        $dataFile = './data/'.$databaseName.'.dat';

        //fopen() reports failure by returning false rather than throwing,
        //so the return values are checked explicitly.
        $needInit = !file_exists($indexFile);
        $this->indexHandler = fopen($indexFile, $needInit ? 'w+b' : 'r+b');
        if ($this->indexHandler === false) {
            return NULL;
        }

        //Initialize a new index file: every bucket slot starts as 0 (empty chain).
        if ($needInit) {
            $initValue = pack('L', 0x00000000);
            for ($i = 0; $i < DB_BUCKET_SIZE; $i++) {
                fwrite($this->indexHandler, $initValue, 4);
            }
        }

        //The open mode of the data file is chosen independently of the index file.
        $this->dataHandler = fopen($dataFile, file_exists($dataFile) ? 'r+b' : 'w+b');
        if ($this->dataHandler === false) {
            fclose($this->indexHandler);
            return NULL;
        }

        return new DataBaseObject($this->dataHandler, $this->indexHandler);
    }

    public function close()
    {
        try {
            if (!$this->closeFlag) {
                fclose($this->dataHandler);
                fclose($this->indexHandler);
                $this->closeFlag = true;
            }
        } catch (Exception $e) {
            return false;
        }
        return true;
    }
}

//Here's an example of how to use it
$db = new MyDataBase();

$dbHandler = $db->connect('dbTest');

$dbHandler->insert('key1', '1111111');
var_dump($dbHandler->find('key1'));

$dbHandler->delete('key1');
var_dump($dbHandler->find('key1'));


$db->close();
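
The index file created by connect() starts with a table of DB_BUCKET_SIZE slots of 4 bytes each (262144 × 4 = 1,048,576 bytes), and index nodes are appended after that table. The following is a minimal sketch of that layout, assuming a tiny bucket count and an in-memory stream instead of a real file. DEMO_BUCKETS, slotOffset and readSlot are hypothetical names used only for illustration.

```php
<?php
// Sketch only: a tiny slot table in an in-memory stream instead of ./index/*.inx.
const DEMO_BUCKETS = 8;   // hypothetical; the article uses 262144

$index = fopen('php://memory', 'w+b');

// Same initialization idea as connect(): one zeroed 4-byte slot per bucket.
fwrite($index, str_repeat(pack('L', 0), DEMO_BUCKETS));

// The slot table occupies the first DEMO_BUCKETS * 4 bytes of the stream.
$tableSize = ftell($index);

// Byte offset of the slot that belongs to a given bucket number.
function slotOffset(int $bucket): int
{
    return ($bucket % DEMO_BUCKETS) * 4;
}

// Read a slot: 0 means the chain for that bucket is still empty.
function readSlot($index, int $bucket): int
{
    fseek($index, slotOffset($bucket), SEEK_SET);
    return unpack('L', fread($index, 4))[1];
}

// Pretend a node was appended right after the table and link it from bucket 3.
fseek($index, slotOffset(3), SEEK_SET);
fwrite($index, pack('L', $tableSize));

printf("table size: %d bytes\n", $tableSize);
printf("bucket 3 -> %d, bucket 4 -> %d\n", readSlot($index, 3), readSlot($index, 4));
```

Since the table always comes first, no node can ever sit at position 0, which is why 0 can safely mean "empty bucket" or "end of chain".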

II. Data Table Operations (DataBaseObject.php)

<?php

/*
*
*   The hash function is hashFunc. It first computes the md5 digest of the key, a 32-character hex string.
*   The Times 33 algorithm is then applied to the first eight characters of that string to get the hash value.
*
*/
class DataBaseObject
{
    private $dataHandler;
    private $indexHandler;

    public function __construct($dataHandler, $indexHandler)
    {
        $this->dataHandler = $dataHandler;
        $this->indexHandler = $indexHandler;
    }

    public function insert($key, $data)
    {
        $offset = $this->hashFunc($key) % DB_BUCKET_SIZE * 4;
        //New nodes and values are appended, so the current file sizes are the offsets at which they will be written
        $indexOffset = fstat($this->indexHandler);
        $indexOffset = intval($indexOffset['size']);

        $dataOffset = fstat($this->dataHandler);
        $dataOffset = intval($dataOffset['size']);

        $keyLen = strlen($key);

        if ($keyLen > DB_KEY_SIZE) {
            return DB_INSERT_FAILED;
        }

        //The next node that the new node points to is 0
        $dataBlock = pack('L', 0x00000000);
        $dataBlock .= $key;
        $space = DB_KEY_SIZE - $keyLen;
        for($i = 0; $i < $space; $i++) {
            //Pad the key with 0 bytes up to DB_KEY_SIZE.
            $dataBlock .= pack('C', 0x00);
        }

        //Offset of new data in data files
        $dataBlock .= pack('L', $dataOffset);
        //New data length
        $dataBlock .= pack('L', strlen($data));

        fseek($this->indexHandler, $offset, SEEK_SET);
        $position = unpack('L', fread($this->indexHandler, 4));
        $position = $position[1];

        //If the bucket is still empty, the new node becomes the head of its chain
        if ($position == 0) {
            fseek($this->indexHandler, $offset, SEEK_SET);
            //Header node points to current node
            fwrite($this->indexHandler, pack('L', $indexOffset), 4);
            fseek($this->indexHandler, 0, SEEK_END);
            //Write to the current index node
            fwrite($this->indexHandler, $dataBlock, DB_INDEX_SIZE);
            fseek($this->dataHandler, 0, SEEK_END);
            //Write new data to a data file
            fwrite($this->dataHandler, $data, strlen($data));
            return DB_INSERT_SUCCESS;
        }

        $foundFlag = false;

        while ($position) {
            fseek($this->indexHandler, $position, SEEK_SET);
            //Gets the value of the current index node
            $tmpBlock = fread($this->indexHandler, DB_INDEX_SIZE);
            $currentKey = substr($tmpBlock, 4, DB_KEY_SIZE);
            //Stored keys are padded with 0 bytes to DB_KEY_SIZE, so compare against the padded key.
            //A prefix comparison with strncmp would also match longer stored keys that start with $key.
            if ($currentKey === str_pad($key, DB_KEY_SIZE, "\0")) {
                //The offset of the data pointed to by the current index in the data file
                $dataOff = unpack('L', substr($tmpBlock, DB_KEY_SIZE + 4, 4));
                $dataOff = $dataOff[1];
                //Length of data pointed to by the current index
                $dataLe = unpack('L', substr($tmpBlock, DB_KEY_SIZE + 8, 4));
                $dataLe = $dataLe[1];
                $foundFlag = true;
                break;
            }
            $prev = $position;
            $position = unpack('L', substr($tmpBlock, 0, 4));
            $position = $position[1];
        }

        if ($foundFlag) {
            return DB_EXISTS_KEY;
        }

        fseek($this->indexHandler, $prev, SEEK_SET);
        //The previous node points to the current node
        fwrite($this->indexHandler, pack('L', $indexOffset), 4);
        fseek($this->indexHandler, 0, SEEK_END);
        //Write to the current index node
        fwrite($this->indexHandler, $dataBlock, DB_INDEX_SIZE);
        fseek($this->dataHandler, 0, SEEK_END);
        //Write new data to a data file
        fwrite($this->dataHandler, $data, strlen($data));
        return DB_INSERT_SUCCESS;
    }

    public function find($key)
    {   
        $offset = $this->hashFunc($key) % DB_BUCKET_SIZE * 4;
        fseek($this->indexHandler, $offset, SEEK_SET);
        $position = unpack('L', fread($this->indexHandler, 4));
        $position = $position[1];

        $foundFlag = false;
        while ($position) {
            fseek($this->indexHandler, $position, SEEK_SET);
            $indexBlock = fread($this->indexHandler, DB_INDEX_SIZE);
            $currentKey = substr($indexBlock, 4, DB_KEY_SIZE);
            //Compare the whole padded key so that 'key1' does not match a stored 'key10'
            if ($currentKey === str_pad($key, DB_KEY_SIZE, "\0")) {
                $dataOffset = unpack('L', substr($indexBlock, DB_KEY_SIZE + 4, 4));
                $dataOffset = $dataOffset[1];

                $dataLen = unpack('L', substr($indexBlock, DB_KEY_SIZE + 8, 4));
                $dataLen = $dataLen[1];

                $foundFlag = true;
                break;
            }

            $position = unpack('L', substr($indexBlock, 0, 4));
            $position = $position[1];
        }

        if ($foundFlag) {
            fseek($this->dataHandler, $dataOffset, SEEK_SET);
            $data = fread($this->dataHandler, $dataLen);
            return $data;
        } else {
            return NULL;
        }


    }

    public function delete($key)
    {
        $offset = $this->hashFunc($key) % DB_BUCKET_SIZE * 4;
        fseek($this->indexHandler, $offset, SEEK_SET);
        $head = unpack('L', fread($this->indexHandler, 4));
        $head = $head[1];

        $current = $head;
        $prev = 0;

        $foundFlag = false;

        while ($current) {
            fseek($this->indexHandler, $current, SEEK_SET);
            $dataBlock = fread($this->indexHandler, DB_INDEX_SIZE);

            $currentKey = substr($dataBlock, 4, DB_KEY_SIZE);

            $next = unpack('L', substr($dataBlock, 0, 4));
            $next = $next[1];

            //Compare the whole padded key so that 'key1' does not match a stored 'key10'
            if ($currentKey === str_pad($key, DB_KEY_SIZE, "\0")) {
                $foundFlag = true;
                break;
            }

            $prev = $current;
            $current = $next;
        }

        if (!$foundFlag) {
            return DB_DELETET_FAILED;
        }

        if ($prev == 0) {
            fseek($this->indexHandler, $offset, SEEK_SET);
        } else {
            fseek($this->indexHandler, $prev, SEEK_SET);
        }

        //Make the previous node (or the bucket slot, if the deleted node was the head)
        //point to the node that follows the deleted one.
        //This unlinks the node from the chain without erasing anything in the index or data files.
        fwrite($this->indexHandler, pack('L', $next), 4);

        return DB_DELETE_SUCCESS;
    }

    protected function hashFunc($str)
    {
        $str = substr(md5($str), 0, 8);
        $hashValue = 0;

        for ($i = 0; $i < 8; $i++) {
            //Times 33: hash = hash * 33 + character code
            $hashValue = 33 * $hashValue + ord($str[$i]);
        }

        return $hashValue & 0x7FFFFFFF;
    }
}
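
To see the chain handling without any file I/O, here is an in-memory sketch of the same idea: separate chaining where a new node is linked after the tail of its chain and deletion only unlinks a node. It is a simplified model, not the implementation above. ChainDemo and its methods are hypothetical names, array positions stand in for file offsets, and crc32 is a stand-in for hashFunc.

```php
<?php
// In-memory model of the chain logic; positions play the role of file offsets
// and 0 means "no node", as in the index file.
class ChainDemo
{
    private $heads;        // bucket number => position of the first node
    private $nodes = [];   // position => ['next' => int, 'key' => string, 'value' => string]
    private $buckets;

    public function __construct(int $buckets = 4)
    {
        $this->buckets = $buckets;
        $this->heads = array_fill(0, $buckets, 0);
    }

    // Stand-in hash (crc32), NOT the article's hashFunc.
    private function bucket(string $key): int
    {
        return (crc32($key) & 0x7FFFFFFF) % $this->buckets;
    }

    public function insert(string $key, string $value): bool
    {
        $b = $this->bucket($key);
        $prev = 0;
        for ($pos = $this->heads[$b]; $pos; $pos = $this->nodes[$pos]['next']) {
            if ($this->nodes[$pos]['key'] === $key) {
                return false;                   // key already exists
            }
            $prev = $pos;
        }
        $new = count($this->nodes) + 1;         // models "append at end of file"
        $this->nodes[$new] = ['next' => 0, 'key' => $key, 'value' => $value];
        if ($prev === 0) {
            $this->heads[$b] = $new;            // empty bucket: new node becomes the head
        } else {
            $this->nodes[$prev]['next'] = $new; // otherwise link it after the tail
        }
        return true;
    }

    public function find(string $key): ?string
    {
        for ($pos = $this->heads[$this->bucket($key)]; $pos; $pos = $this->nodes[$pos]['next']) {
            if ($this->nodes[$pos]['key'] === $key) {
                return $this->nodes[$pos]['value'];
            }
        }
        return null;
    }

    public function delete(string $key): bool
    {
        $b = $this->bucket($key);
        $prev = 0;
        for ($pos = $this->heads[$b]; $pos; $pos = $this->nodes[$pos]['next']) {
            if ($this->nodes[$pos]['key'] === $key) {
                $next = $this->nodes[$pos]['next'];
                if ($prev === 0) {
                    $this->heads[$b] = $next;           // deleted node was the head
                } else {
                    $this->nodes[$prev]['next'] = $next; // bypass the deleted node
                }
                return true;                    // node stays stored, just unreachable
            }
            $prev = $pos;
        }
        return false;
    }

    // Number of nodes ever stored, including unlinked ones.
    public function storedNodes(): int
    {
        return count($this->nodes);
    }
}

$demo = new ChainDemo(1);          // one bucket forces every key into the same chain
$demo->insert('a', '1');
$demo->insert('b', '2');
$demo->delete('a');
var_dump($demo->find('a'));        // unlinked, so no longer found
var_dump($demo->find('b'));        // still reachable through the bucket head
var_dump($demo->storedNodes());    // the deleted node still occupies space
```

The last call illustrates the trade-off of deleting only from the index: lookups no longer see the key, but the space it used is not reclaimed.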

github source address

Posted by inztinkt on Thu, 16 May 2019 15:19:50 -0700