The previous chapter tested the serial backup and recovery of KADB, pg_dump/pg_restore. This chapter tests the parallel backup and recovery gpbackup/gprestore of KADB.
Because pg_dump and pg_restore is limited by the master node and cannot meet the backup and recovery of a large amount of data. Therefore, KADB provides gpbackup and gprestore parallel backup and recovery tools.
gpbackup Specific parameter descriptions can be implemented through
gpbackup -- help get
--dbname database_name Specify the database to back up.
--backup-dir directory copies all backup files (metadata files and data files) to the specified path.
--compression-level level Specifies the gzip compression level (from 1 to 9) used to compress data files. The default is 1. Note that gpbackup Compression is used by default.
--data-only Only the table data is backed up to the CSV file, and the metadata file used to rebuild the table and other database objects is not backed up.
--debug Displays debugging information during operation.
--exclude-schema schema_name Specifies the name of the mode to be excluded from backup. This parameter can be specified multiple times to exclude multiple modes.
--exclude-table schema.table specifies the tables to exclude from the backup.
--exclude-table-file file_name specifies a list of files that need to be excluded from the backup.
--include-schema schema_name specifies the name of the database schema to include in the backup.
--include-table schema.table specifies a table to include in the backup.
--include-table-file file_name Specify a table name list file required for backup.
--incremental Specify this option to add an incremental backup to the incremental backup set.
--from-timestamp Backup timestamp specifies the backup timestamp.
--jobs int Specifies the number of parallel tasks for table backup. The default gpbackup uses one task (database connection). Increasing the value of this parameter can improve the speed of data backup.
--leaf-partition-data For partitioned tables, when this option is defined, a separate data file will be created for each sub partition instead of a completed file for the entire table (default)
--metadata-only Only the metadata file (DDL) that can be used to rebuild the database is created, and the actual data of the table is not backed up.
--no-compression Table data CSV files are not compressed.
--plugin-config config-file_location specifies the location of the gpbackup plug-in configuration file, which is a text file in YAML format.
--quiet Do not display all non alarm and non error log information.
--single-data-file Create a separate data file for all backed up tables on each Segment host.
--verbose Print detailed log information.
--version Print the tool version number and exit.
--with-stats Include query plan statistics in the backup set.
--Help displays online help information.
Testing gpbackup backups
- Create backup database
drop database if exists restore; create database restore;
- Log in to the backup library and create a backup table
CREATE TABLE t1(i int, j int) with(appendonly=true,orientation=row); INSERT INTO t1 SELECT a , a from generate_series(1,10000) as a;
- Perform backup
[mppadmin@mdw ~]$ gpbackup --dbname restore --leaf-partition-data --backup-dir /home/mppadmin/dbbackup/ 20211011:13:51:39 gpbackup:mppadmin:mdw:004497-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Starting backup of database restore 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Backup Timestamp = 20211011135139 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Backup Database = restore 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Gathering table state information 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Acquiring ACCESS SHARE locks on tables Locks acquired: 1 / 1 [================================================================] 100.00% 0s 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Gathering additional table metadata 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Getting partition definitions 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Getting storage information 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Getting child partitions with altered schema 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Metadata will be written to /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011135139/gpbackup_20211011135139_metadata.sql 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing global database metadata 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Global database metadata backup complete 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing pre-data metadata 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Pre-data metadata metadata backup complete 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing post-data metadata 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Post-data metadata backup complete 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Writing data to file Tables backed up: 1 / 1 [==============================================================] 100.00% 0s 20211011:13:51:40 gpbackup:mppadmin:mdw:004497-[INFO]:-Data backup complete 20211011:13:51:41 gpbackup:mppadmin:mdw:004497-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml 20211011:13:51:41 gpbackup:mppadmin:mdw:004497-[INFO]:-Email containing gpbackup report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011135139/gpbackup_20211011135139_report will not be sent 20211011:13:51:41 gpbackup:mppadmin:mdw:004497-[INFO]:-Backup completed successfully
Backup succeeded. View the backup file
gprestore restore command
Specific parameter descriptions can be implemented through
gprestore --help get
--timestamp YYYYMMDDHHMMSS Specify a timestamp to recover the gpbackup backup set.
--backup-dir directory Find all backup files (metadata files and data files) from the specified path.
--create-db Create a database before restoring database object metadata.
--data-only Only the table data created by gpbackup is restored, and the operation of creating database tables is not restored.
--debug Displays details and debug logs during the operation.
--exclude-schema schema_name Specifies the database schema to exclude during the recovery operation.
--exclude-table schema.table specifies the tables to exclude during the recovery operation.
--exclude-table-file file_name specifies the list file of tables to exclude during recovery.
--include-schema schema_name specifies the database schema to restore.
--include-table schema.table specifies a table to recover.
--include-table-file file_name specifies a list file containing the name of the recovery table.
--jobs int specifies the number of parallel connections to recover table data and metadata. The default gprestore uses 1 connection. Increasing this parameter will improve the speed of data recovery.
--metadata-only Create database tables from a backup set created by gpbackup without restoring data.
--on-error-continue Specify this option to ignore SQL errors and continue execution when creating database metadata (such as tables, objects, or functions) or recovering data. If there are other types of errors, the tool exits. The tool displays an error message summary and writes the error message to the gprestore the log file and continue the recovery operation.
The default is to exit the first time an error occurs.
--plugin-config config-file_location Specify the location of the gpbackup plug-in configuration file, which is a text file in YAML format.
--quiet Prohibit any non alarm and non error log output.
--redirect-db database_name restores to the specified database_name instead of the default database name at the time of backup.
--verbose Displays detailed log information during the recovery operation.
--version Print the version number and exit.
--with-globals Restore the KADB database system objects and other database objects in the backup set.
--with-stats Restore query schedule statistics from the backup collection.
--help Displays online help information.
Test gprestore to recover database tables
- Delete the backup database and execute the restore command.
[mppadmin@mdw ~]$ psql Timing is on. psql (9.4.24) Type "help" for help. test=# drop database restore ; DROP DATABASE Time: 298.357 ms test=# \q [mppadmin@mdw ~]$ gprestore --timestamp 20211011135139 --create-db --backup-dir /home/mppadmin/dbbackup/ 20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-Restore Key = 20211011135139 20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd 20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-gprestore version = 1.12.1+dev.95.g18c6fdd 20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48 20211011:14:00:32 gprestore:mppadmin:mdw:006638-[INFO]:-Creating database 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Database creation complete for: restore 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Restoring pre-data metadata Pre-data objects restored: 6 / 6 [=====================================================] 100.00% 0s 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Pre-data metadata restore complete Tables restored: 1 / 1 [===============================================================] 100.00% 0s 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Data restore complete 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Restoring post-data metadata 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Post-data metadata restore complete 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Email containing gprestore report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011135139/gprestore_20211011135139_20211011140032_report will not be sent 20211011:14:00:34 gprestore:mppadmin:mdw:006638-[INFO]:-Restore completed successfully [mppadmin@mdw ~]$
- View recovery results
The database and tables were recovered.
Gpbackup and gprestore are parallel backup tools provided by KADB. All nodes will back up their data locally and recover from the local during recovery. In this way, it will not be limited by the master of the primary node, so as to improve the speed of backup and recovery.. At the same time, gpbackup and gprestore also support incremental backup of AO tables. The premise of incremental backup is to have a full backup. Use the full backup above to test incremental backup.
The specific backup methods are as follows:
- Insert 1000 pieces of data into the backup table.
- Perform incremental backup
[mppadmin@mdw ~]$ gpbackup --dbname restore --leaf-partition-data --backup-dir /home/mppadmin/dbbackup/ --incremental 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Starting backup of database restore 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Backup Timestamp = 20211011141517 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Backup Database = restore 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Gathering table state information 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Acquiring ACCESS SHARE locks on tables Locks acquired: 1 / 1 [================================================================] 100.00% 0s 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Gathering additional table metadata 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Getting partition definitions 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Getting storage information 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Getting child partitions with altered schema 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Metadata will be written to /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011141517/gpbackup_20211011141517_metadata.sql 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing global database metadata 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Global database metadata backup complete 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing pre-data metadata 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Pre-data metadata metadata backup complete 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing post-data metadata 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Post-data metadata backup complete 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Basing incremental backup off of backup with timestamp = 20211011135139 20211011:14:15:17 gpbackup:mppadmin:mdw:010109-[INFO]:-Writing data to file Tables backed up: 1 / 1 [==============================================================] 100.00% 0s 20211011:14:15:18 gpbackup:mppadmin:mdw:010109-[INFO]:-Data backup complete 20211011:14:15:19 gpbackup:mppadmin:mdw:010109-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml 20211011:14:15:19 gpbackup:mppadmin:mdw:010109-[INFO]:-Email containing gpbackup report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011141517/gpbackup_20211011141517_report will not be sent 20211011:14:15:19 gpbackup:mppadmin:mdw:010109-[INFO]:-Backup completed successfully [mppadmin@mdw ~]$
- View backup files
20211011141517 is the incremental backup file
- Delete the backup library for recovery
[mppadmin@mdw ~]$ psql Timing is on. psql (9.4.24) Type "help" for help. test=# drop database restore ; DROP DATABASE Time: 285.860 ms test=# \q [mppadmin@mdw ~]$ gprestore --timestamp 20211011141517 --create-db --backup-dir /home/mppadmin/dbbackup/ 20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-Restore Key = 20211011141517 20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-gpbackup version = 1.12.1+dev.95.g18c6fdd 20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-gprestore version = 1.12.1+dev.95.g18c6fdd 20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-Greenplum Database Version = 6.0.0 build dev.V003R002C001B0110.befc7c7b48 20211011:14:22:27 gprestore:mppadmin:mdw:011798-[INFO]:-Creating database 20211011:14:22:28 gprestore:mppadmin:mdw:011798-[INFO]:-Database creation complete for: restore 20211011:14:22:28 gprestore:mppadmin:mdw:011798-[INFO]:-Restoring pre-data metadata Pre-data objects restored: 6 / 6 [=====================================================] 100.00% 0s 20211011:14:22:28 gprestore:mppadmin:mdw:011798-[INFO]:-Pre-data metadata restore complete Tables restored: 1 / 1 [===============================================================] 100.00% 0s 20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Data restore complete 20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Restoring post-data metadata 20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Post-data metadata restore complete 20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Found neither /home/mppadmin/mpp/bin/gp_email_contacts.yaml nor /home/mppadmin/gp_email_contacts.yaml 20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Email containing gprestore report /home/mppadmin/dbbackup/mppseg-1/backups/20211011/20211011141517/gprestore_20211011141517_20211011142227_report will not be sent 20211011:14:22:29 gprestore:mppadmin:mdw:011798-[INFO]:-Restore completed successfully [mppadmin@mdw ~]$
- View recovery results
From the recovery results, the deleted backup library is recovered normally. Incremental backup greatly reduces the backup time, but currently gpbackup only supports incremental backup of AO tables.
At present, the various backup methods provided by KADB can meet the backup needs of users. Of course, there are also places to be improved. I hope the backup of KADB will be more perfect and provide better services for customers.