Compression and archiving on Linux

Keywords: Linux network Unix Windows

A very brief overview.

Compression

Compression trades CPU time for disk space: an algorithm re-encodes the data so that it occupies fewer bytes. The same saving reduces the bandwidth needed to send the file over a network.

Text files usually compress well. Binary programs compress less, and files that are already compressed, such as most picture formats, hardly shrink at all.
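The difference is easy to see with a small experiment. The sketch below is only an illustration: the file names and the sample sentence are made up, and random bytes are used as a stand-in for data that is already compressed.

```shell
#!/bin/sh
# Compare how well gzip shrinks repetitive text versus random bytes.
workdir=$(mktemp -d)

# 5000 identical lines of 20 bytes each: 100000 bytes of highly redundant text.
awk 'BEGIN { for (i = 0; i < 5000; i++) print "the quick brown fox" }' > "$workdir/text.txt"

# 100000 random bytes stand in for data that is already compressed.
head -c 100000 /dev/urandom > "$workdir/random.bin"

# gzip -c writes to STDOUT, so the inputs stay in place; wc -c counts the bytes.
text_gz=$(gzip -c "$workdir/text.txt" | wc -c)
random_gz=$(gzip -c "$workdir/random.bin" | wc -c)

echo "text:   100000 -> $((text_gz)) bytes"
echo "random: 100000 -> $((random_gz)) bytes"

rm -rf "$workdir"
```

The text shrinks to a small fraction of its size, while the random data stays at roughly 100000 bytes.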

compress, uncompress

File name: *.Z

An old compression tool that is rarely used today.

gzip, gunzip, zcat

File name: *.gz

gzip compresses files. Brief syntax:

# gzip [OPTION] FILE...

FILE is the name of a file to compress; more than one may be given.

[root@C7 tmp]# ls -l
total 364
-rw-r--r--. 1 root root  18104 Jan  3  2018 functions
-rw-r--r--  1 root root  22384 Apr 17 14:20 lvm_mount_point.xfsdump
-rw-------  1 root root 327049 Apr 23 16:20 messages
[root@C7 tmp]# gzip functions lvm_mount_point.xfsdump messages 
[root@C7 tmp]# ls -l
total 64
-rw-r--r-- 1 root root  5021 Jan  3  2018 functions.gz
-rw-r--r-- 1 root root   769 Apr 17 14:20 lvm_mount_point.xfsdump.gz
-rw------- 1 root root 50126 Apr 23 16:20 messages.gz

After compression, the source file is removed automatically and replaced by a compressed file with the .gz suffix.

-d: Decompress. Equivalent to the gunzip command.

[root@C7 tmp]# gzip -d functions.gz
[root@C7 tmp]# gunzip functions.gz

The two commands above are alternatives; run one or the other, not both in sequence. Similarly, after decompression the compressed file is removed and the original file is restored.

-# --fast --best: "#" is a number from 1 to 9 that sets the compression level. 1 (--fast) is the fastest and compresses the least, 9 (--best) compresses the most and is the slowest. The default is 6.
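A quick way to see what the level changes is to compress the same input at several levels and compare the sizes. This is a sketch with made-up sample data; the exact numbers depend on the input and the gzip version.

```shell
#!/bin/sh
# Compress one input at levels 1, 6 (default) and 9 and compare the sizes.
workdir=$(mktemp -d)

# Sample input: 20000 numbered, log-like lines (made-up content).
awk 'BEGIN { for (i = 1; i <= 20000; i++) print i, "sample log line" }' > "$workdir/sample.log"

fast=$(gzip -1 -c "$workdir/sample.log" | wc -c)
default=$(gzip -c "$workdir/sample.log" | wc -c)
best=$(gzip -9 -c "$workdir/sample.log" | wc -c)

echo "level 1: $((fast)) bytes"
echo "level 6: $((default)) bytes"
echo "level 9: $((best)) bytes"

rm -rf "$workdir"
```

On most inputs the gap between levels 6 and 9 is small, which is why the default is rarely changed.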

-c: Write the compressed data to STDOUT and leave the source file untouched. Combined with redirection, this compresses a file without deleting it.

[root@C7 tmp]# ls -l functions*
-rw-r--r-- 1 root root 18104 Jan  3  2018 functions
[root@C7 tmp]# gzip -c functions > functions.gz
[root@C7 tmp]# ls -l functions*
-rw-r--r-- 1 root root 18104 Jan  3  2018 functions
-rw-r--r-- 1 root root  5021 Apr 23 16:51 functions.gz

A compressed text file cannot be read directly with a text viewer such as cat. To view its contents without decompressing the file on disk, use zcat.

[root@C7 tmp]# zcat functions.gz | tail
              "x$1" = xcondrestart ] ; then

        systemctl_redirect $0 $1
        exit $?
    fi
fi

strstr "$(cat /proc/cmdline)" "rc.debug" && set -x
return 0
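Because zcat writes to STDOUT, it also fits into pipelines, for example to search a compressed file. The sketch below uses gzip -dc, which the gzip manual describes as identical to zcat; the file name and contents are made up for illustration.

```shell
#!/bin/sh
# Search inside a .gz file without unpacking it on disk.
workdir=$(mktemp -d)

printf 'alpha\nbeta\ngamma\n' > "$workdir/words.txt"
gzip "$workdir/words.txt"            # leaves only words.txt.gz behind

# gzip -dc (the same as zcat) streams the decompressed text to grep.
matches=$(gzip -dc "$workdir/words.txt.gz" | grep -c 'eta')
echo "matching lines: $((matches))"

# The compressed file is still the only file in the directory.
remaining=$(ls "$workdir")
echo "files on disk: $remaining"

rm -rf "$workdir"
```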

bzip2, bunzip2, bzcat

File name: *.bz2

These commands work the same way as the gzip family.

With the gzip shown here, keeping the source file requires the -c option plus redirection (gzip 1.6 and later also accept -k). bzip2 supports the -k option directly.

-k --keep: Do not delete the source file when compressing.

[root@C7 tmp]# bzip2 -k lvm_mount_point.xfsdump
[root@C7 tmp]# ls -l lvm_mount_point.xfsdump*
-rw-r--r-- 1 root root 22384 Apr 17 14:20 lvm_mount_point.xfsdump
-rw-r--r-- 1 root root   764 Apr 17 14:20 lvm_mount_point.xfsdump.bz2

xz, unxz, xzcat

File name: *.xz

These commands work the same way as the bzip2 family, including the -k option.

Compression summary

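Compression ratio: xz > bzip2 > gzip. This ordering is typical rather than guaranteed, and the better ratios generally cost more CPU time.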
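The ranking can be checked on any file of your own. The following sketch compresses one made-up sample file with each tool that happens to be installed and prints the resulting sizes; it makes no claim about what the numbers will be on your data.

```shell
#!/bin/sh
# Compress the same input with gzip, bzip2 and xz and report the sizes.
workdir=$(mktemp -d)

# Sample input: 20000 numbered, log-like lines (made-up content).
awk 'BEGIN { for (i = 1; i <= 20000; i++) print i, "sample log line" }' > "$workdir/sample.log"

report=""
for tool in gzip bzip2 xz; do
    if command -v "$tool" >/dev/null 2>&1; then
        # All three tools accept -c to write the compressed stream to STDOUT.
        size=$("$tool" -c "$workdir/sample.log" | wc -c)
        line="$tool: $((size)) bytes"
    else
        line="$tool: not installed, skipped"
    fi
    report="$report$line
"
done

printf '%s' "$report"
rm -rf "$workdir"
```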

The file command reports the actual format of a compressed file by inspecting its contents. The extension alone does not determine the type; it is only a naming convention for the user's benefit.

[root@C7 tmp]# file functions.gz lvm_mount_point.xfsdump.bz2 messages.xz 
functions.gz:                gzip compressed data, was "functions", from Unix, last modified: Wed Jan  3 00:29:40 2018
lvm_mount_point.xfsdump.bz2: bzip2 compressed data, block size = 900k
messages.xz:                 XZ compressed data
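Tools like file recognise a format from signature bytes at the start of the data. As a rough illustration, the sketch below stores gzip data under a misleading .txt name (a made-up file) and reads the first two bytes by hand; a gzip stream always starts with the bytes 1f 8b.

```shell
#!/bin/sh
# Show that the content, not the extension, identifies a gzip file.
workdir=$(mktemp -d)

# gzip data saved under a misleading .txt name.
printf 'hello\n' | gzip -c > "$workdir/notes.txt"

# Dump the first two bytes as hex and strip the whitespace od adds.
magic=$(head -c 2 "$workdir/notes.txt" | od -An -tx1 | tr -d ' \n')
echo "first two bytes: $magic"

if [ "$magic" = "1f8b" ]; then
    echo "notes.txt holds gzip data despite its name"
fi

rm -rf "$workdir"
```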

 

Archiving

The compression commands above cannot merge several files into one compressed file, nor can they compress a directory.

[root@C7 tmp]# file init.d/
init.d/: directory
[root@C7 tmp]# gzip init.d/
gzip: init.d/ is a directory -- ignored
[root@C7 tmp]# bzip2 init.d/
bzip2: Input file init.d/ is a directory.
[root@C7 tmp]# xz init.d/
xz: init.d/: Is a directory, skipping

To do either of these, you need an archive.

Archiving merges multiple files and directories into a single archive file, similar to packing files with WinRAR on Windows.

Two archiving tools are common, cpio and tar. cpio is the older command and has largely been replaced by tar, so this article covers only tar.

Create an archive

File name: *.tar, *.tar.gz, *.tar.bz2, *.tar.xz

# tar -c[zjJ]f ARCH_FILE FILE...

-c: Create an archive.

-f: Specify the archive file name (ARCH_FILE). The name must follow -f immediately, so f has to be the last letter of a combined option group. For example, "-fc" fails because tar reads "c" as the archive name.

FILE: The files or directories to archive.

Compression can be requested while archiving. tar does not compress the data itself; it calls the compression tools described above.

-z: Compress with gzip.

-j: Compress with bzip2.

-J: Compress with xz.

[root@C7 tmp]# tar -czf test.tar.gz functions init.d/ lvm_mount_point.xfsdump messages 
[root@C7 tmp]# file test.tar.gz 
test.tar.gz: gzip compressed data, from Unix, last modified: Tue Apr 23 17:48:25 2019
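Since tar hands the compression over to gzip, the -z option can be thought of as shorthand for piping an uncompressed archive through gzip. The sketch below builds the same archive both ways and compares the listings; the directory and file names are made up for illustration.

```shell
#!/bin/sh
# Build a .tar.gz in one step (-z) and in two steps (tar piped into gzip).
workdir=$(mktemp -d)
mkdir "$workdir/proj"
echo one > "$workdir/proj/a.txt"
echo two > "$workdir/proj/b.txt"

# One step: tar invokes gzip compression itself.
tar -C "$workdir" -czf "$workdir/one_step.tar.gz" proj

# Two steps: "-f -" sends the plain archive to STDOUT, gzip compresses it.
tar -C "$workdir" -cf - proj | gzip > "$workdir/two_step.tar.gz"

list_one=$(tar -tzf "$workdir/one_step.tar.gz")
list_two=$(tar -tzf "$workdir/two_step.tar.gz")

if [ "$list_one" = "$list_two" ]; then
    echo "both archives list the same members"
fi

rm -rf "$workdir"
```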

Archiving does not delete the source files, so functions, init.d/, lvm_mount_point.xfsdump and messages still exist after the archive has been created.

Extract an archive

# tar -xf ARCH_FILE [-C EXTRACT_DIR]

-x extracts the archive and silently overwrites existing files of the same name. The compression format does not have to be given; tar detects it automatically.

[root@C7 tmp]# rm -rf functions lvm_mount_point.xfsdump messages init.d/
[root@C7 tmp]# tar -xf test.tar.gz
[root@C7 tmp]# ls -ld functions lvm_mount_point.xfsdump messages init.d/
-rw-r--r-- 1 root root  18104 Jan  3  2018 functions
drwxr-xr-x 2 root root     70 Apr 23 17:31 init.d/
-rw-r--r-- 1 root root  22384 Apr 17 14:20 lvm_mount_point.xfsdump
-rw------- 1 root root 327049 Apr 23 16:20 messages
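The overwrite behaviour is worth seeing once, because it can discard changes made after the archive was created. A minimal sketch with a made-up file name:

```shell
#!/bin/sh
# Extraction replaces a file that was edited after the archive was made.
workdir=$(mktemp -d)

echo "original" > "$workdir/config.txt"
tar -C "$workdir" -cf "$workdir/backup.tar" config.txt

# Change the file after the backup was taken.
echo "edited after the backup" > "$workdir/config.txt"

# Extracting puts the archived version back without asking.
tar -C "$workdir" -xf "$workdir/backup.tar"
restored=$(cat "$workdir/config.txt")
echo "config.txt now contains: $restored"

rm -rf "$workdir"
```

Extracting into an empty directory with -C avoids this kind of accident.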

-C EXTRACT_DIR: Extract into the given directory. If omitted, the archive is extracted into the current directory.

[root@C7 tmp]# mkdir new_dir
[root@C7 tmp]# tar -xf test.tar.gz -C new_dir/
[root@C7 tmp]# ls -l new_dir/
total 364
-rw-r--r-- 1 root root  18104 Jan  3  2018 functions
drwxr-xr-x 2 root root     70 Apr 23 17:31 init.d
-rw-r--r-- 1 root root  22384 Apr 17 14:20 lvm_mount_point.xfsdump
-rw------- 1 root root 327049 Apr 23 16:20 messages

List archive contents

# tar -tf ARCH_FILE

-t: List the members of the archive without extracting them.

[root@C7 tmp]# tar -tf test.tar.gz 
functions
init.d/
init.d/README
init.d/functions
init.d/netconsole
init.d/network
lvm_mount_point.xfsdump
messages
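Listing first is a cheap way to check what an archive will write before extracting it. The sketch below counts the members and looks for one particular path; the archive and its contents are made up for illustration. Adding -v to -t also prints permissions, owner, size and date, much like ls -l.

```shell
#!/bin/sh
# Inspect an archive before extracting it.
workdir=$(mktemp -d)
mkdir "$workdir/proj"
echo one > "$workdir/proj/a.txt"
echo two > "$workdir/proj/b.txt"
tar -C "$workdir" -czf "$workdir/proj.tar.gz" proj

# Plain listing: one member per line (the directory plus its two files).
listing=$(tar -tf "$workdir/proj.tar.gz")
entries=$(printf '%s\n' "$listing" | wc -l)
echo "members: $((entries))"

# Check for one specific member by its exact path.
if printf '%s\n' "$listing" | grep -x 'proj/a.txt' >/dev/null; then
    echo "proj/a.txt is in the archive"
fi

# Verbose listing with permissions, owner, size and date.
tar -tvf "$workdir/proj.tar.gz"

rm -rf "$workdir"
```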

Posted by rea|and on Tue, 23 Apr 2019 19:39:38 -0700