Dazzle "library" action - award solicitation of Jincang of the National People's Congress - backup and recovery of Jincang analytical database KADB - parallel backup and recovery

Keywords: Database Big Data PostgreSQL

The previous chapter tested the serial backup and recovery of KADB, pg_dump/pg_restore. This chapter tests the parallel backup and recovery gpbackup/gprestore of KADB.

Because pg_dump and pg_restore is limited by the master node and cannot meet the backup and recovery of a large amount of data. Therefore, KADB provides gpbackup and gprestore parallel backup and recovery tools.

  gpbackup   Specific parameter descriptions can be implemented through
  gpbackup  -- help get

--dbname   database_name   Specify the database to back up.

--backup-dir   directory copies all backup files (metadata files and data files) to the specified path.

--compression-level   level   Specifies the gzip compression level (from 1 to 9) used to compress data files. The default is 1. Note that gpbackup   Compression is used by default.

--data-only   Only the table data is backed up to the CSV file, and the metadata file used to rebuild the table and other database objects is not backed up.

--debug   Displays debugging information during operation.

--exclude-schema   schema_name   Specifies the name of the mode to be excluded from backup. This parameter can be specified multiple times to exclude multiple modes.

--exclude-table   schema.table specifies the tables to exclude from the backup.

--exclude-table-file   file_name specifies a list of files that need to be excluded from the backup.

--include-schema   schema_name specifies the name of the database schema to include in the backup.

--include-table   schema.table specifies a table to include in the backup.

--include-table-file   file_name   Specify a table name list file required for backup.

--incremental   Specify this option to add an incremental backup to the incremental backup set.

--from-timestamp   Backup timestamp specifies the backup timestamp.

--jobs   int   Specifies the number of parallel tasks for table backup. The default gpbackup uses one task (database connection). Increasing the value of this parameter can improve the speed of data backup.

--leaf-partition-data   For partitioned tables, when this option is defined, a separate data file will be created for each sub partition instead of a completed file for the entire table (default)

--metadata-only   Only the metadata file (DDL) that can be used to rebuild the database is created, and the actual data of the table is not backed up.

--no-compression   Table data CSV files are not compressed.

--plugin-config   config-file_location specifies the location of the gpbackup plug-in configuration file, which is a text file in YAML format.

--quiet   Do not display all non alarm and non error log information.

--single-data-file   Create a separate data file for all backed up tables on each Segment host.

--verbose   Print detailed log information.

--version   Print the tool version number and exit.

--with-stats   Include query plan statistics in the backup set.

--Help displays online help information.

  Testing gpbackup backups

  • Create backup database
drop database if exists restore;
create database restore;

  • Log in to the backup library and create a backup table
CREATE TABLE t1(i int, j int) with(appendonly=true,orientation=row);
INSERT INTO t1 SELECT a , a from generate_series(1,10000) as a;

  •   Perform backup
[mppadmin@mdw ~]$ gpbackup  --dbname restore --leaf-partition-data  --backup-dir  /home/mppadmin/dbbackup/
20211011:13:51:39 gpbackup:mppadmin:mdw:004497-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Starting backup of database restore
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Backup Timestamp = 20211011135139
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Backup Database = restore
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Gathering table state information
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Acquiring ACCESS SHARE locks on tables
Locks acquired:  1 / 1 [================================================================] 100.00% 0s
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Gathering additional table metadata
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Getting partition definitions
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Getting storage information
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Getting child partitions with altered schema
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Metadata will be written to /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011135139/gpbackup_20211011135139_metadata.sql
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing global database metadata
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Global database metadata backup complete
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing pre-data metadata
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Pre-data metadata metadata backup complete
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing post-data metadata
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Post-data metadata backup complete
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing data to file
Tables backed up:  1 / 1 [==============================================================] 100.00% 0s
20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Data backup complete
20211011:13:51:41 gpbackup:mppadmin:mdw:004497-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml
20211011:13:51:41 gpbackup:mppadmin:mdw:004497-[INFO]:-Email containing gpbackup report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011135139/gpbackup_20211011135139_report will not be sent
20211011:13:51:41 gpbackup:mppadmin:mdw:004497-[INFO]:-Backup completed successfully

Backup succeeded. View the backup file  

gprestore restore command

Specific parameter descriptions can be implemented through
  gprestore --help get

--timestamp   YYYYMMDDHHMMSS   Specify a timestamp to recover the gpbackup backup set.

--backup-dir   directory   Find all backup files (metadata files and data files) from the specified path.

--create-db   Create a database before restoring database object metadata.

--data-only   Only the table data created by gpbackup is restored, and the operation of creating database tables is not restored.

--debug   Displays details and debug logs during the operation.

--exclude-schema   schema_name   Specifies the database schema to exclude during the recovery operation.

--exclude-table   schema.table specifies the tables to exclude during the recovery operation.

--exclude-table-file   file_name specifies the list file of tables to exclude during recovery.

--include-schema   schema_name specifies the database schema to restore.

--include-table   schema.table specifies a table to recover.

--include-table-file   file_name specifies a list file containing the name of the recovery table.

--jobs   int specifies the number of parallel connections to recover table data and metadata. The default gprestore uses 1 connection. Increasing this parameter will improve the speed of data recovery.

--metadata-only   Create database tables from a backup set created by gpbackup without restoring data.

--on-error-continue   Specify this option to ignore SQL errors and continue execution when creating database metadata (such as tables, objects, or functions) or recovering data. If there are other types of errors, the tool exits. The tool displays an error message summary and writes the error message to the   gprestore the log file and continue the recovery operation.

The default is to exit the first time an error occurs.

--plugin-config   config-file_location   Specify the location of the gpbackup plug-in configuration file, which is a text file in YAML format.

--quiet   Prohibit any non alarm and non error log output.

--redirect-db   database_name restores to the specified database_name instead of the default database name at the time of backup.

--verbose   Displays detailed log information during the recovery operation.

--version   Print the version number and exit.

--with-globals   Restore the KADB database system objects and other database objects in the backup set.

--with-stats   Restore query schedule statistics from the backup collection.

--help   Displays online help information.

  Test gprestore to recover database tables

  • Delete the backup database and execute the restore command.
[mppadmin@mdw ~]$ psql
Timing is on.
psql (9.4.24)
Type "help" for help.

test=# drop database restore ;
DROP DATABASE
Time: 298.357 ms
test=# \q
[mppadmin@mdw ~]$ gprestore --timestamp 20211011135139 --create-db --backup-dir /home/mppadmin/dbbackup/
20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-Restore Key = 20211011135139
20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd
20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-gprestore version = 1.12.1+dev.95.g18c6fdd
20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48
20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-Creating database
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Database creation complete for: restore
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Restoring pre-data metadata
Pre-data objects restored:  6 / 6 [=====================================================] 100.00% 0s
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Pre-data metadata restore complete
Tables restored:  1 / 1 [===============================================================] 100.00% 0s
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Data restore complete
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Restoring post-data metadata
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Post-data metadata restore complete
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Email containing gprestore report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011135139/gprestore_20211011135139_20211011140032_report will not be sent
20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Restore completed successfully
[mppadmin@mdw ~]$
  • View recovery results

The database and tables were recovered.

Gpbackup and gprestore are parallel backup tools provided by KADB. All nodes will back up their data locally and recover from the local during recovery. In this way, it will not be limited by the master of the primary node, so as to improve the speed of backup and recovery.. At the same time, gpbackup and gprestore also support incremental backup of AO tables. The premise of incremental backup is to have a full backup. Use the full backup above to test incremental backup.

The specific backup methods are as follows:

  • Insert 1000 pieces of data into the backup table.

  • Perform incremental backup
[mppadmin@mdw ~]$ gpbackup  --dbname restore --leaf-partition-data  --backup-dir  /home/mppadmin/dbbackup/ --incremental
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Starting backup of database restore
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Backup Timestamp = 20211011141517
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Backup Database = restore
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Gathering table state information
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Acquiring ACCESS SHARE locks on tables
Locks acquired:  1 / 1 [================================================================] 100.00% 0s
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Gathering additional table metadata
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Getting partition definitions
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Getting storage information
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Getting child partitions with altered schema
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Metadata will be written to /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011141517/gpbackup_20211011141517_metadata.sql
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing global database metadata
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Global database metadata backup complete
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing pre-data metadata
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Pre-data metadata metadata backup complete
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing post-data metadata
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Post-data metadata backup complete
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Basing incremental backup off of backup with timestamp = 20211011135139
20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing data to file
Tables backed up:  1 / 1 [==============================================================] 100.00% 0s
20211011:14:15:18 gpbackup:mppadmin:mdw:010109-[INFO]:-Data backup complete
20211011:14:15:19 gpbackup:mppadmin:mdw:010109-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml
20211011:14:15:19 gpbackup:mppadmin:mdw:010109-[INFO]:-Email containing gpbackup report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011141517/gpbackup_20211011141517_report will not be sent
20211011:14:15:19 gpbackup:mppadmin:mdw:010109-[INFO]:-Backup completed successfully
[mppadmin@mdw ~]$
  •   View backup files

  20211011141517 is the incremental backup file

  • Delete the backup library for recovery
[mppadmin@mdw ~]$ psql
Timing is on.
psql (9.4.24)
Type "help" for help.

test=# drop database restore ;
DROP DATABASE
Time: 285.860 ms
test=# \q
[mppadmin@mdw ~]$ gprestore --timestamp 20211011141517 --create-db --backup-dir /home/mppadmin/dbbackup/
20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-Restore Key = 20211011141517
20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd
20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-gprestore version = 1.12.1+dev.95.g18c6fdd
20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48
20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-Creating database
20211011:14:22:28 gprestore:mppadmin:mdw:011798-[INFO]:-Database creation complete for: restore
20211011:14:22:28 gprestore:mppadmin:mdw:011798-[INFO]:-Restoring pre-data metadata
Pre-data objects restored:  6 / 6 [=====================================================] 100.00% 0s
20211011:14:22:28 gprestore:mppadmin:mdw:011798-[INFO]:-Pre-data metadata restore complete
Tables restored:  1 / 1 [===============================================================] 100.00% 0s
20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Data restore complete
20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Restoring post-data metadata
20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Post-data metadata restore complete
20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml
20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Email containing gprestore report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011141517/gprestore_20211011141517_20211011142227_report will not be sent
20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Restore completed successfully
[mppadmin@mdw ~]$
  • View recovery results

  From the recovery results, the deleted backup library is recovered normally. Incremental backup greatly reduces the backup time, but currently gpbackup only supports incremental backup of AO tables.

At present, the various backup methods provided by KADB can meet the backup needs of users. Of course, there are also places to be improved. I hope the backup of KADB will be more perfect and provide better services for customers.

Posted by tbeinc on Mon, 11 Oct 2021 20:41:01 -0700