laravel's mysql split table and split table associated query solution

Keywords: PHP MySQL Laravel

0X00 problem appears

How does laravel of php implement horizontal sub table? Recently, there was a project in hand. The framework used the laravel7 framework of php. Due to the surge in the amount of data, 300000 rows of data were produced in one night for the details of rapid iterative development, and the main table had 12 million data. Because this is only used to store and record user video details, it is decided to use the monthly horizontal sub table.

In the early stage, the union all was directly used to join the tables, resulting in the SQL not going to the index. At 6:00 in the morning, the customer frantically sent a message that the production was suspended. The customer quickly checked and found that all SQL were scanning the whole table, killing the SQL server. The statement is as follows

select * from (select * from video_time_info_details union all select * from video_time_info_details_10_1) where xxxx

Because the xxx condition is after the Union, the index will not be used, so it hangs. Therefore, the sub table is reconstructed

0x01 reconstructing split table logic

1. model insert and update sub table

The updated sub table is very simple and can be used directly in__ In the construct construction method, add a month suffix to the table to judge whether the following table exists. If it does not exist, it will be created, and if it exists, it will be inserted directly. Because it belongs to the detail table, it is called very frequently, so it is cached to avoid querying the database every time to judge whether there is a table. The code is as follows

class VideoTimeInfoDetails extends Model
{
    use SerializeDate, SplitTableTrait;

    protected $table;

    public function __construct(array $attributes = [])
    {
        parent::__construct($attributes);
        // Manually divide the table by month
        $this->table = 'video_time_info_details_' . date('Y_m') . "_1"; // Now the sub table is saved once a month. Suffix plus_ 1. The purpose is to quickly avoid the problem that the table cannot be linked caused by the monthly table when the daily table is divided in the future, so as to avoid leaving a pit for the future
        $hasTable = Cache::get('has_' . $this->table, "0"); // Cache the current table
        if ($hasTable == "0") { // If no cache exists
            if (!Schema::hasTable($this->table)) { // If no table exists, create a table
                DB::update('create table ' . DB::getTablePrefix() . $this->table . ' like ' . DB::getTablePrefix() . 'video_time_info_details');
            }
            Cache::set('has_' . $this->table, "1");
        }
    }
}

In this way, we can save, update and insert the model normally, and the table will be divided automatically

2. Query of model

After document research, we are going to use laravel's globalScope to monitor all target models. For business processes that need to query by table, we add a makeUnionQuery method to return the queryBuilder of the model, and leave a special ID for the table name of the sql, so that we can monitor this ID in our own encapsulated globalScope, Then reassemble the sql and where conditions to construct conditional union statements.

(1) Processing of model

Add a global scope to the boot method of model. The code is as follows:

    public static function boot()
    {
        parent::boot();

        static::addGlobalScope(new SplitTableScope); // The method is shown below
    }

Add a makeUnionQuery method to the model to replace the original query method, so that the previous VideoTimeInfoDetail::query() can be replaced by VideoTimeInfoDetail::makeUnionQuery(). The code is as follows:

    /**
     * Construct UNION table
     */
    public static function makeUnionQuery($startTime,$endTime = null)
    {
        $queryList = static::getSubTablesByMonth($startTime,$endTime); // The table to query is shown below
        // Construct join table union
        $queries = collect();
        // Start cycle
        $unionTable = $queries->push("video_time_info_details");
        foreach ($queryList as $suffix) {
            $tempTable = 'video_time_info_details_' . $suffix;
            $hasTable = Cache::get('has_' . $tempTable, "0");
            if ($hasTable == "0") {
                if (!Schema::hasTable($tempTable)) {
                    DB::update('create table ' . DB::getTablePrefix() . $tempTable . ' like ' . DB::getTablePrefix() . 'video_time_info_details');
                }
                Cache::set('has_' . $tempTable, "1");
            }

            if ($hasTable == "1") {
                $queries->push($tempTable);
            }
        }

        $self = new self; // Return model instead of raw query builder
        $unionTableSql = implode('{SPLIT_TABLE_FLAG}', $unionTable->toArray());

        return $self->setTable(DB::raw("{SPLIT_TABLE}$unionTableSql{SPLIT_TABLE}"));
    }

The getSubTablesByMonth method is an encapsulated method to obtain the table name based on the number of months from the specified start date to the target end date. The code is as follows

/**
     * Monthly table
     */
    public static function getSubTablesByMonth($startTime, $endTime = null)
    {
        $endTime = empty($endTime) ? time() : $endTime;
        if ($endTime instanceof \Illuminate\Support\Carbon) {
            $endTime = $endTime->timestamp;
        }
        $now = strtotime(date("Y-m-1", $endTime)); // Timestamp of current number 1
        $indexTime = empty($startTime) ? time() : $startTime;
        if ($indexTime instanceof \Illuminate\Support\Carbon) {
            $indexTime = $indexTime->timestamp;
        }
        $indexTime = strtotime(date("Y-m-1", $indexTime)); // Timestamp 1 of the start time
        $queryList = [];
        while ($indexTime <= $now) {
            $queryList[] = date('Y_m', $indexTime) . "_1";
            $indexTime = strtotime("+1 month", $indexTime);
        }
        $queryList = array_unique($queryList); // Table to query
        return $queryList;
    }

Next, create a class SplitTableScope to monitor and replace the results before executing mysql. The code is as follows

namespace App\Scopes;

use Illuminate\Database\Eloquent\Scope;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\Builder;
use Illuminate\Database\Eloquent\ScopeInterface;
use Illuminate\Support\Facades\DB;

class SplitTableScope implements Scope
{
    /**
     * Apply the scope to a given Eloquent query builder.
     *
     * @param  \Illuminate\Database\Eloquent\Builder  $builder
     * @param  \Illuminate\Database\Eloquent\Model  $model
     * @return void
     */
    public function apply(Builder $builder, Model $model)
    {
        $query = $builder->getQuery();
        $from = $query->from;
        $query->from = null;
        if (strpos($from, "{SPLIT_TABLE}") === false) {
            return;
        }
        $splitFrom = str_replace("{SPLIT_TABLE}", "", $from);
        $splitTables = explode("{SPLIT_TABLE_FLAG}", $splitFrom);

        // Processing where
        $wheres = (array)$query->wheres;

        $columns = empty($query->columns) ? "*" : $query->columns;
        $bindings = empty($query->bindings) ? [] : $query->bindings;
        $myBindings = []; // Record bindings, otherwise it will appear later? Inconsistent with parameters
        // Just spell it
        $queries = collect();
        foreach ($splitTables as $table) {
            $tempDb = DB::table($table);
            $tempBindings = [];
            foreach ($wheres as $key => $where) {
                $tempDb->where($where['column'], $where['operator'], $where['value'], $where['boolean']);
                $tempBindings[] = $where['value'];
            }
            $myBindings = array_merge($myBindings, $tempBindings);
            $queries->push($tempDb->select($columns));
        }

        $firstQuery = $queries->shift();

        $queries->each(function ($item, $key) use ($firstQuery) {
            $firstQuery->unionAll($item);
        });

        $bindings = array_merge($myBindings, $bindings);
        $query->bindings = $bindings;
        $query->from = DB::raw("({$firstQuery->toSql()}) as video_time_info_details_all");

    }
}

In the above, we have replaced the from of our original sql, and spliced the bindings on the original basis. Then, the where and limit executed will be spliced automatically. The new sql will go through the index, and the speed will be greatly improved. The specific calling methods are as follows

VideoTimeInfoDetails::makeUnionQuery($LearnTime->created_at,$LearnTime->updated_at)->where('video_time_info_id',$LearnTime['id'])->orderByDesc('start_play_time')->first();

0x02 production status

Because the other end uses the go language, the sub table query of go will be described later. Take a look at the number of table rows produced now:

1. Table quantity


Because the sub table is used in October, in order to prevent problems in the previous month, the previous date will also be created according to the time when it is created

2. Main table quantity


Sub table at 18 million

3.10 monthly table quantity


It has reached 4 million, and it has to be subdivided later

0x03 conclusion

Because the detail s of this project must be saved for 3 years, I didn't expect that the data volume would grow so fast. If I had known so, I should have saved them all in log databases such as mongodb. But fortunately, Party A's father has the ability to cash, so he can withstand the pressure~

Posted by -entropyman on Wed, 27 Oct 2021 09:20:17 -0700