For file, directory, and path operations in Python, we usually use the os.path module.
pathlib is its replacement. It builds on os.path and wraps paths as objects, which makes the API more approachable, operations more convenient, and the code closer to how programmers think.
The pathlib module provides classes that represent file system paths with semantics appropriate for different operating systems. The path classes are divided into pure paths (which provide purely computational operations without I/O) and concrete paths (which inherit from the pure paths but also provide I/O operations).
First, let's look at how the pathlib module is organized. Its core consists of six classes. PurePath is the base class, and the other five classes are derived from it:
In the inheritance diagram, an arrow connects two classes that have an inheritance relationship. Take PurePosixPath and PurePath as an example: PurePosixPath inherits from PurePath, so the former is a subclass of the latter.
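The inheritance relationships described above can be checked directly with issubclass. This is only a small sketch; the list name relationships is made up for illustration. Note that the concrete classes PosixPath and WindowsPath also inherit from the matching pure class.

```python
from pathlib import (
    PurePath, PurePosixPath, PureWindowsPath,
    Path, PosixPath, WindowsPath,
)

# Each pair is (subclass, expected parent class).
relationships = [
    (PurePosixPath, PurePath),
    (PureWindowsPath, PurePath),
    (Path, PurePath),
    (PosixPath, Path),
    (PosixPath, PurePosixPath),
    (WindowsPath, Path),
    (WindowsPath, PureWindowsPath),
]

# True only if every pair really is a subclass relationship.
hierarchy_ok = all(issubclass(child, parent) for child, parent in relationships)
print(hierarchy_ok)
```

Importing WindowsPath works on any platform; only instantiating it is restricted to Windows.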
- PurePath class: treats a path as an ordinary string. It can join several strings into a path in the format of the current operating system, and it can also test whether two paths are equal. As the name suggests, "pure" means that PurePath only deals with path manipulation and does not care whether the path is valid in the real file system, or whether the file or directory actually exists.
- PurePosixPath and PureWindowsPath are subclasses of PurePath. The former handles UNIX-style paths (including Mac OS X), and the latter handles Windows-style paths. The two styles differ in their path separators, among other things.
- The Path class differs from the three classes above. In addition to manipulating paths, it can operate on files and directories and interact with the real file system, for example to check whether a path actually exists.
- PosixPath and WindowsPath are subclasses of Path, used for UNIX-style (including Mac OS X) paths and Windows-style paths respectively.
PurePath, PurePosixPath, and PureWindowsPath are three pure path classes commonly used in some special situations, such as:
- You need to manipulate Windows paths on a UNIX machine, or UNIX paths on a Windows machine. A concrete Windows path cannot be instantiated on UNIX, but a pure Windows path can, which lets us work with it as if we were on Windows.
- You want to make sure that your code only manipulates paths and never interacts with the operating system.
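As a minimal sketch of the first situation, the following code handles a Windows-style path and a UNIX-style path with the pure classes. It should behave the same on any operating system because no file system access takes place. Both paths are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A Windows-style path, usable even on Linux or macOS (hypothetical path).
win = PureWindowsPath(r'C:\Users\demo\notes.txt')
print(win.drive)    # C:
print(win.name)     # notes.txt
print(win.parent)   # C:\Users\demo

# A UNIX-style path, usable even on Windows (hypothetical path).
posix = PurePosixPath('/home/demo/notes.txt')
print(posix.parts)  # ('/', 'home', 'demo', 'notes.txt')
```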
Background: path formats differ between UNIX-like operating systems and Windows. The main differences are the root and the path separator. On UNIX the root is a slash (/), while on Windows a path is anchored by a drive (such as C:); UNIX uses the forward slash (/) as the path separator, while Windows uses the backslash (\).
1, PurePath class
The PurePath class (like PurePosixPath and PureWindowsPath) provides a large number of constructors, instance methods, and instance properties.
Instantiating PurePath automatically adapts to the operating system. On UNIX or Mac OS X, the constructor actually returns a PurePosixPath object; on Windows, it returns a PureWindowsPath object.
For example, in a Windows system, execute the following statement:
    from pathlib import PurePath

    path = PurePath('file.txt')
    print(type(path))
    # <class 'pathlib.PureWindowsPath'>
PurePath also supports passing in multiple path strings when creating objects, which will be spliced into one path. For example:
    from pathlib import PurePath

    path = PurePath('https:', 'www.liujiangblog.com', 'django')
    print(path)
    # https:\www.liujiangblog.com\django
As you can see, because the code runs on Windows, the output is a path in Windows format.
If you want to create UNIX style paths in Windows, you need to specify the use of the PurePosixPath class, and vice versa. For example:
    from pathlib import PurePosixPath

    path = PurePosixPath('https:', 'www.liujiangblog.com', 'django')
    print(path)
    # https:/www.liujiangblog.com/django
Emphasis: pure path operations are just string manipulation. They have no connection to the local file system and perform no disk I/O. A path built with PurePath essentially wraps a string, and it can be converted to a plain string with str().
In addition, if no argument is passed to the PurePath constructor, the result is equivalent to passing in a dot (.), which denotes the current path:
    from pathlib import PurePath

    path1 = PurePath()
    path2 = PurePath('.')
    print(path1 == path2)
    # True
If the arguments passed to the PurePath constructor contain more than one root path, only the last root path and the segments after it take effect. For example:
    from pathlib import PurePath

    path = PurePath('C:/', 'D:/', 'file.txt')
    print(path)
    # D:\file.txt
As an additional reminder, when writing path strings in Python, pay attention to the difference between forward slashes and backslashes, to whether a backslash needs escaping, and to whether you are using a raw string (r'...'). These are easy to get wrong.
Redundant slashes and single dots (.) in the arguments passed to the PurePath constructor are ignored, but double dots (..) are not:
    from pathlib import PurePath

    path = PurePath('C:/./../file.txt')
    print(path)
    # C:\..\file.txt
PurePath instances support comparison operators. Paths of the same style can be tested for equality or ordered (ordering is essentially a string comparison). Paths of different styles can only be tested for equality (and they are never equal); they cannot be ordered:
    from pathlib import PurePosixPath, PureWindowsPath

    # UNIX-style paths are case sensitive
    print(PurePosixPath('/D/file.txt') == PurePosixPath('/d/file.txt'))
    # False
    # Windows-style paths are case insensitive
    print(PureWindowsPath('D://file.txt') == PureWindowsPath('d://file.txt'))
    # True
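The rule about comparing paths of different styles can be illustrated with a short sketch. The variable names are chosen only for this example.

```python
from pathlib import PurePosixPath, PureWindowsPath

posix_path = PurePosixPath('docs/a.txt')
windows_path = PureWindowsPath('docs/a.txt')

# Equality across styles is allowed, and the result is False.
same = (posix_path == windows_path)

# Ordering across styles is not supported and raises TypeError.
try:
    posix_path < windows_path
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

print(same)                           # False
print(type(ordering_error).__name__)  # TypeError
```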
The common methods and properties of PurePath instances are listed below:
Instance properties and methods | Function description |
---|---|
PurePath.parts | Returns the parts contained in the path string. |
PurePath.drive | Returns the drive letter in the path string. |
PurePath.root | Returns the root path in the path string. |
PurePath.anchor | Returns the drive letter and root path in the path string. |
PurePath.parents | Returns all the parent paths of the current path. |
PurePath.parent | Returns the parent of the current path, equivalent to parents[0]. |
PurePath.name | Returns the filename in the current path. |
PurePath.suffixes | Returns all suffix names of files in the current path. |
PurePath.suffix | Returns the file suffix in the current path, which is the last element of the suffixes list. |
PurePath.stem | Returns the filename in the current path without its last suffix. |
PurePath.as_posix() | Converts the current path to a UNIX style path. |
PurePath.as_uri() | Converts the current path to a file URI. Only absolute paths can be converted; otherwise ValueError is raised. |
PurePath.is_absolute() | Determine whether the current path is an absolute path. |
PurePath.joinpath(*other) | Joins multiple paths together, like the slash (/) operator. |
PurePath.match(pattern) | Determines whether the current path matches the given glob-style pattern. |
PurePath.relative_to(*other) | Returns the current path relative to the given reference path. ValueError is raised if that is not possible. |
PurePath.with_name(name) | Replace the filename in the current path with a new filename. ValueError is raised if there is no filename in the current path. |
PurePath.with_suffix(suffix) | Replace the file suffix in the current path with a new suffix. If there is no suffix in the current path, a new suffix is added. |
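To make the table more concrete, here is a small sketch that applies several of these properties and methods to one hypothetical Windows-style path. PureWindowsPath is used so that the results should not depend on the machine running the code.

```python
from pathlib import PureWindowsPath

p = PureWindowsPath('C:/projects/demo/archive.tar.gz')  # hypothetical path

print(p.parts)       # ('C:\\', 'projects', 'demo', 'archive.tar.gz')
print(p.drive)       # C:
print(p.root)        # \
print(p.anchor)      # C:\
print(p.parent)      # C:\projects\demo
print(p.name)        # archive.tar.gz
print(p.suffixes)    # ['.tar', '.gz']
print(p.suffix)      # .gz
print(p.stem)        # archive.tar
print(p.as_posix())  # C:/projects/demo/archive.tar.gz
print(p.is_absolute())                # True
print(p.relative_to('C:/projects'))   # demo\archive.tar.gz
print(p.with_name('notes.txt'))       # C:\projects\demo\notes.txt
print(p.with_suffix('.zip'))          # C:\projects\demo\archive.tar.zip
```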
2, Path class
More often, we use the Path class directly, not PurePath.
Path is a subclass of PurePath. In addition to the constructors, properties, and methods provided by PurePath, it provides methods to check whether a path exists and whether it refers to a file or a folder. If it is a file, Path also supports reading and writing it.
Path has two subclasses, PosixPath and WindowsPath. The function of these two subclasses is obvious and will not be discussed in detail.
Basic use
    from pathlib import Path

    # Create instances
    p = Path('a', 'b', 'c/d')
    p = Path('/etc')

    p = Path()
    # WindowsPath('.')
    p.resolve()    # Resolve to an absolute path; the path does not have to exist
    # WindowsPath('C:/Users/liujiangblog')

    # cwd() and home() return the current working directory and the user's
    # home directory, whether called on an instance or on the class
    p.cwd()
    # WindowsPath('D:/work/2020/django3')
    Path.cwd()
    # WindowsPath('D:/work/2020/django3')
    p.home()
    # WindowsPath('C:/Users/liujiangblog')
    Path.home()
    # WindowsPath('C:/Users/liujiangblog')
Directory operation
    p = Path(r'd:\test\11\22')
    p.mkdir(exist_ok=True)    # Create the directory; the parent directory must already exist, otherwise an error is raised
    # In general, I use the following form instead
    p.mkdir(exist_ok=True, parents=True)    # Create the directory together with any missing parents
    p.rmdir()    # Delete the directory; it must be empty
    p
    # WindowsPath('d:/test/11/22')    the Path object p still exists
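The effect of the parents and exist_ok arguments can be tried safely inside a temporary directory. The following is a sketch under that assumption; the directory names 11 and 22 mirror the example above, and the variable names are illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / '11' / '22'

    # Without parents=True, the missing parent '11' causes an error.
    try:
        nested.mkdir()
        missing_parent_error = None
    except FileNotFoundError as exc:
        missing_parent_error = exc

    nested.mkdir(parents=True, exist_ok=True)  # creates '11' and '22'
    nested.mkdir(parents=True, exist_ok=True)  # second call does nothing
    created = nested.is_dir()

    nested.rmdir()                             # works only on an empty directory
    removed = not nested.exists()

print(type(missing_parent_error).__name__)  # FileNotFoundError
print(created, removed)                     # True True
```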
Traverse directory
    p = Path(r'd:\test')
    # WindowsPath('d:/test')
    p.iterdir()     # Equivalent to os.listdir, but yields Path objects
    p.glob('*')     # Like os.listdir, but accepts a matching pattern
    p.rglob('*')    # Similar to os.walk, and also accepts a matching pattern
Create a file
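The difference between the three traversal methods is easier to see on a small directory tree. This sketch builds one in a temporary directory; all file names are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'sub').mkdir()
    (root / 'a.py').touch()
    (root / 'b.txt').touch()
    (root / 'sub' / 'c.py').touch()

    # iterdir(): direct children only (the order is not guaranteed, so sort).
    children = sorted(child.name for child in root.iterdir())

    # glob(): pattern matching inside this directory only.
    top_level_py = sorted(path.name for path in root.glob('*.py'))

    # rglob(): pattern matching in this directory and all subdirectories.
    all_py = sorted(path.relative_to(root).as_posix() for path in root.rglob('*.py'))

print(children)      # ['a.py', 'b.txt', 'sub']
print(top_level_py)  # ['a.py']
print(all_py)        # ['a.py', 'sub/c.py']
```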
    file = Path(r'd:\test\11\22\test.py')
    file.touch()    # touch() creates an empty file; the directory must exist, otherwise the file cannot be created
    # Traceback (most recent call last):
    #   File "<input>", line 1, in <module>
    #   .....
    # FileNotFoundError: [Errno 2] No such file or directory: 'd:\\test\\11\\22\\test.py'

    p = Path(r'd:\test\11\22')
    p.mkdir(exist_ok=True, parents=True)
    file.touch()
File operation
    p = Path(r'd:\test\tt.txt.bk')
    p.name        # File name
    # tt.txt.bk
    p.stem        # File name without the last suffix
    # tt.txt
    p.suffix      # File extension
    # .bk
    p.suffixes    # All file suffixes
    # ['.txt', '.bk']
    p.parent      # Equivalent to os.path.dirname
    # WindowsPath('d:/test')
    p.parents     # An iterable of all ancestor directories
    # <WindowsPath.parents>
    for i in p.parents:
        print(i)
    # d:\test
    # d:\
    p.parts       # Split the path into a tuple of its components
    # ('d:\\', 'test', 'tt.txt.bk')

    p = Path('C:/Users/Administrator/Desktop/')
    p.parent
    # WindowsPath('C:/Users/Administrator')
    p.parent.parent
    # WindowsPath('C:/Users')

    # Index 0 is the direct parent directory. The larger the index, the closer to the root
    for x in p.parents:
        print(x)
    # C:\Users\Administrator
    # C:\Users
    # C:\

    # with_name(name) replaces the last part of the path and returns a new path
    Path("/home/liujiangblog/test.py").with_name('python.txt')
    # WindowsPath('/home/liujiangblog/python.txt')

    # with_suffix(suffix) replaces the extension and returns a new path. If the path has no extension, the suffix is appended
    Path("/home/liujiangblog/test.py").with_suffix('.txt')
    # WindowsPath('/home/liujiangblog/test.txt')
File information
    p = Path(r'd:\test\tt.txt')
    p.stat()    # Get detailed file information
    # os.stat_result(st_mode=33206, st_ino=562949953579011, st_dev=3870140380, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1525254557, st_mtime=1525254557, st_ctime=1525254557)
    p.stat().st_size     # File size in bytes
    # 0
    p.stat().st_ctime    # Creation time on Windows (on UNIX, the time of the last metadata change)
    # 1525254557.2090347
    # Other information can be obtained in the same way
    p.stat().st_mtime    # Modification time
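The timestamps returned by stat() are seconds since the epoch, so they usually need converting before being shown to a person. A minimal sketch, assuming a throwaway file named sample.txt in a temporary directory:

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / 'sample.txt'   # hypothetical file name
    sample.write_bytes(b'hello')

    info = sample.stat()
    size = info.st_size                               # size in bytes
    modified = datetime.fromtimestamp(info.st_mtime)  # local time

print(size)                  # 5
print(modified.isoformat())  # depends on when the code runs
```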
File reading and writing
open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)
This method works like Python's built-in open function and returns a file object.
    p = Path('C:/Users/Administrator/Desktop/text.txt')
    with p.open(encoding='utf-8') as f:
        print(f.readline())
read_bytes(): reads the file in 'rb' mode and returns the data as bytes
write_bytes(data): writes data to the file in 'wb' mode
    p = Path('C:/Users/Administrator/Desktop/text.txt')
    p.write_bytes(b'Binary file contents')
    # 20
    p.read_bytes()
    # b'Binary file contents'
read_text(encoding=None, errors=None): reads the file that the path points to in 'r' mode and returns the text
write_text(data, encoding=None, errors=None): writes a string to the file that the path points to in 'w' mode
    p = Path('C:/Users/Administrator/Desktop/text.txt')
    p.write_text('Text file contents')
    # 18
    p.read_text()
    # 'Text file contents'
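The reading and writing methods can be combined in a short round trip. The sketch below works in a temporary directory and passes encoding='utf-8' explicitly, which is a choice made for this example rather than something the methods require.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'text.txt'   # hypothetical file name

    # write_text() returns the number of characters written.
    written = target.write_text('first line\nsecond line\n', encoding='utf-8')

    # read_text() returns the whole file as one string.
    content = target.read_text(encoding='utf-8')

    # open() gives a regular file object for line-by-line access.
    with target.open(encoding='utf-8') as f:
        first_line = f.readline()

print(written)            # 23
print(repr(first_line))   # 'first line\n'
```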
Judgment operation
Each of the following methods returns a Boolean:
- is_dir(): whether it is a directory
- is_file(): whether it is a regular file
- is_symlink(): whether it is a symbolic link
- is_socket(): whether it is a socket file
- is_block_device(): whether it is a block device
- is_char_device(): whether it is a character device
- is_absolute(): whether it is an absolute path
    p = Path(r'd:\test')
    p = Path(p, 'test.txt')    # Join path segments
    p.exists()     # Check whether the path exists
    p.is_file()    # Check whether it is a regular file
    p.is_dir()     # Check whether it is a directory
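One detail worth seeing in practice is that these checks return False for a path that does not exist instead of raising an error. A small sketch in a temporary directory, with illustrative names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file_path = folder / 'test.txt'

    # The file has not been created yet: both checks are False.
    before = (file_path.exists(), file_path.is_file())

    file_path.touch()

    # After creation it exists and is a regular file, not a directory.
    after = (file_path.exists(), file_path.is_file(), file_path.is_dir())

    # The containing folder is a directory, not a file.
    folder_checks = (folder.is_dir(), folder.is_file())

print(before)         # (False, False)
print(after)          # (True, True, False)
print(folder_checks)  # (True, False)
```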
Path splicing and decomposition
In pathlib, paths are joined with the / operator, in three main forms:
- Path object / path object
- Path object / String
- String / Path object
A path is decomposed mainly through the parts property:
    p = Path()
    p
    # WindowsPath('.')
    p = p / 'a'
    p
    # WindowsPath('a')
    p = 'b' / p
    p
    # WindowsPath('b/a')
    p2 = Path('c')
    p = p2 / p
    p
    # WindowsPath('c/b/a')
    p.parts
    # ('c', 'b', 'a')
    p.joinpath("c:", "liujiangblog.com", "jack")    # A segment that contains a drive discards everything before it
    # WindowsPath('c:liujiangblog.com/jack')
Wildcard
- glob(pattern): matches the given pattern against the entries in the directory
- rglob(pattern): matches the given pattern recursively, searching all subdirectories as well
Return value: a generator
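The three forms of the / operator, and the rule that a later absolute segment replaces what came before, can be sketched with pure paths so that the result is the same on every platform. The paths are hypothetical.

```python
from pathlib import PurePosixPath

base = PurePosixPath('/srv/app')                 # hypothetical path

joined = base / 'static' / 'css'                 # path object / string
reversed_join = 'prefix' / PurePosixPath('a/b')  # string / path object
both_paths = base / PurePosixPath('logs')        # path object / path object

# An absolute right-hand side replaces everything to its left.
replaced = base / '/etc/hosts'

# joinpath() gives the same result as chaining the / operator.
same_as_joinpath = (base.joinpath('static', 'css') == joined)

print(joined)         # /srv/app/static/css
print(reversed_join)  # prefix/a/b
print(both_paths)     # /srv/app/logs
print(replaced)       # /etc/hosts
```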
    p = Path(r'd:\vue_learn')
    p.glob('*.html')    # Match all HTML files; returns a generator
    # <generator object Path.glob at 0x000002ECA2199F90>
    list(p.glob('*.html'))
    # [WindowsPath('d:/vue_learn/base.html'), WindowsPath('d:/vue_learn/components.html'), WindowsPath('d:/vue_learn/demo.html').........................
    g = p.rglob('*.html')    # Recursive matching
    next(g)
    # WindowsPath('d:/vue_learn/base.html')
    next(g)
    # WindowsPath('d:/vue_learn/components.html')
Pattern matching
Use the match method to test a path against a glob-style pattern (not a regular expression). It returns True if the path matches.
    p = Path('C:/Users/Administrator/Desktop/text.txt')
    p.match('*.txt')
    # True
    Path('C:/Users/Administrator/Desktop/text.txt').match('**/*.txt')
    # True
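How match treats relative and absolute patterns is not obvious from the example above. A relative pattern is matched from the right-hand end of the path, while an absolute pattern has to match the whole path. The sketch below uses a hypothetical UNIX-style pure path so that it runs the same way everywhere.

```python
from pathlib import PurePosixPath

p = PurePosixPath('/home/demo/project/text.txt')   # hypothetical path

by_suffix = p.match('*.txt')            # relative pattern, matched from the right
by_parent = p.match('project/*.txt')    # last two components match
wrong_parent = p.match('demo/*.txt')    # 'demo' is not the direct parent
absolute = p.match('/*.txt')            # absolute pattern must match the whole path

print(by_suffix, by_parent, wrong_parent, absolute)  # True True False False
```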
More examples
    from pathlib import Path

    p1 = Path(__file__)             # Path of the current file
    # D:\liujiangblog\test1.py
    p2 = Path.cwd()                 # Current working directory
    # D:\test
    p3 = Path.cwd().parent          # Parent of the current working directory
    # D:\
    p = Path.cwd().joinpath('a')    # Path joining
    # D:\test\a
    st = Path(__file__).stat()      # Information about the current file
    # os.stat_result(st_mode=33206, st_ino=6473924464701313, st_dev=1559383105, st_nlink=1, st_uid=0, st_gid=0, st_size=300, st_atime=1578661629, st_mtime=1578661629, st_ctime=1576891792)
    a = st.st_size                  # File size in bytes
    p = p1.parent                   # Parent path of p1
    z = p1.parents                  # All ancestor paths of p1, returned as an object
    # for i in z:
    #     print(i)
    pp = Path('D:/python')          # Create a path object
    a = pp.is_file()                # Check whether pp is a file
    a = pp.is_dir()                 # Check whether pp is a directory
    a = p2.is_absolute()            # Check whether p2 is an absolute path
    a = p2.match(r'd:\*')           # Check whether p2 matches a pattern
    a = p2.glob('*.py')             # Search p2 for files matching the pattern (p2 directory only)
    a = p3.glob('**/*.py')          # Search p3 for files matching the pattern, including all subdirectories
    # a = p3.rglob('*.py')          # The same search written with rglob
    # for i in a:
    #     print(i)
    # pp.mkdir()                    # Create the directory; raises an exception if it already exists
    a = p1.name                     # File name
    # test1.py
    a = p1.suffix                   # Suffix
    # .py
    a = pp.stem                     # Last part of the path, without the suffix
    a = pp.with_name('vocab.txt')   # Replace the last part and return a new path
    a = p1.with_suffix('.lm')       # Replace the extension and return a new path; the suffix is appended if there is none
    # D:\ss\test1.lm
    root = Path('d:/')
    a = root.iterdir()              # Iterator over the files and folders directly in this directory (subdirectories are not traversed)
    # for i in a:
    #     print(i)
    file = Path('D:/ss.lm')
    # file.rename('d:/cc.txt')      # Rename or move; works for both files and folders
    #                               # Raises an exception if the file does not exist
    #                               # A move must stay on the same drive
    #                               # Raises an exception if the target already exists (on Windows)
    file.replace('d:/cc.txt')       # Rename or move; works for both files and folders
    #                               # Like rename, but overwrites the target if it already exists
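The overwrite behaviour of replace can be tried without touching real files by working in a temporary directory. In this sketch the file names ss.lm and cc.txt are borrowed from the example above and the file contents are invented.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / 'ss.lm'
    target = root / 'cc.txt'
    source.write_text('new', encoding='utf-8')
    target.write_text('old', encoding='utf-8')

    # replace() moves source onto target and overwrites the existing file.
    source.replace(target)

    final_content = target.read_text(encoding='utf-8')
    source_gone = not source.exists()

print(final_content)  # new
print(source_gone)    # True
```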