File Processing in Python
Basic file operations
Use the absolute path of the file; if the file is in the current working directory (usually the directory the script is run from), the file name alone is enough.
About file encoding: open a file with the same encoding it was saved in; otherwise the text may come out garbled or fail to decode.
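The effect of an encoding mismatch can be sketched as follows. This is only an illustration: the scratch file name and the sample text 'café' are assumptions made for the demo, not part of the original notes.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.txt')

# Save the text as UTF-8: 'é' is stored as the two bytes 0xC3 0xA9.
with open(path, 'w', encoding='utf-8') as f:
    f.write('café')

# Same encoding on the way back: the text round-trips unchanged.
with open(path, 'r', encoding='utf-8') as f:
    same = f.read()

# A wrong encoding that accepts every byte silently yields garbled text.
with open(path, 'r', encoding='latin-1') as f:
    garbled = f.read()

# A wrong encoding that rejects the bytes raises an exception instead.
try:
    with open(path, 'r', encoding='ascii') as f:
        f.read()
    failed = False
except UnicodeDecodeError:
    failed = True

print(repr(same), repr(garbled), failed)
```

Which of the two failure modes appears depends on the encoding pair, so neither should be relied on to detect a mismatch.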
# Create a file test.txt containing three lines: 11111111, 22222222, 33333333
- open() function: opens the file by requesting it from the operating system through a system call, occupies a file handle, and returns a file object
You can specify the access mode and the encoding used to read the file
file = open('test.txt','r',encoding='utf-8')
print(file)
# <_io.TextIOWrapper name='test.txt' mode='r' encoding='utf-8'>
- close() function: closes the open file and releases the file handle it occupied
file = open('test.txt','r',encoding='utf8')
file.close()
- with open() as VAR:
A file opened in a with statement is closed automatically when the block ends, so there is no need to call close() explicitly
with open('a.txt','w',encoding='gbk') as file:
    file.write('Police\n')
with open('a.txt','r',encoding='gbk') as file:
    data = file.readlines()
    print(data)
# Creates the a.txt file
# ['Police\n']
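That the with statement really closes the file can be checked through the file object's closed attribute. The following is a minimal sketch; the scratch file and the simulated RuntimeError are assumptions made for the demo.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

with open(path, 'w', encoding='utf-8') as file:
    file.write('Police\n')
    inside = file.closed       # the file is still open inside the block

after = file.closed            # the file is closed once the block ends

# The file is also closed when the block is left through an exception.
try:
    with open(path, 'r', encoding='utf-8') as file2:
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
after_error = file2.closed

print(inside, after, after_error)
```

This is the main reason to prefer with over a manual open()/close() pair: a forgotten or skipped close() cannot leak the handle.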
Reading files
- read() function: reads the contents of the file and returns them as a string
After reading, the cursor is at the end of the file
Call close() after reading to close the file and release the file handle.
file = open('test.txt','r',encoding='utf8')
data = file.read()
print(data)
file.close()
# 11111111
# 22222222
# 33333333
- readable() function: checks whether the file can be read and returns True or False
file = open('test.txt','r',encoding='utf-8')
data1 = file.readable()
print(data1)
# True
- readline() function: reads one line per call; at the end of the file it returns an empty string
Each line returned keeps its trailing newline and print() adds another one; pass end='' to print() to avoid the extra blank line
file = open('test.txt','r',encoding='utf-8')
print('The first line:',file.readline())
print('The second line:',file.readline(),end='')
print('The third line:',file.readline(),end='')
print('The fourth line:',file.readline(),end='')
# The first line: 11111111
#
# The second line: 22222222
# The third line: 33333333
# The fourth line:
- readlines() function: reads all remaining lines and returns them as a list of strings, each keeping its trailing newline
file = open('test.txt','r',encoding='utf-8')
print(file.readlines())
# ['11111111\n', '22222222\n', '33333333\n']
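As a complement to readline() and readlines(), a file object can also be iterated directly, which yields one line at a time without building the whole list first. This is a sketch under assumptions: the scratch file stands in for test.txt and is created by the example itself.

```python
import os
import tempfile

# Hypothetical scratch file with the same three lines as test.txt.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('11111111\n22222222\n33333333\n')

stripped = []
with open(path, 'r', encoding='utf-8') as f:
    for line in f:                          # yields one line per iteration
        stripped.append(line.rstrip('\n'))  # drop the trailing newline

print(stripped)
```

For large files this loop is usually preferable to readlines(), because only the current line has to be held in memory.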
Writing files
In 'w' mode, if the file exists, it is emptied and made ready for writing
If the file does not exist, an empty file is created and made ready for writing
file = open('test.txt','w',encoding='utf8')
print(file)
print(file.readable())
# <_io.TextIOWrapper name='test.txt' mode='w' encoding='utf8'>
# False
- write() function: writes content to a file
In text mode, only str data can be written to a file
Add the newline character "\n" yourself wherever a line break is needed
file = open('test.txt','w',encoding='utf8')
file.write('11111')
file.write('\n22222\n')
file.close()
file = open('test.txt','r',encoding='utf8')
print(file.read())
# test.txt is created automatically if it does not exist
# 11111
# 22222
- writable() function: test whether the file is writable
file = open('test.txt','w',encoding='utf8')
print(file.writable())
# True
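- writelines() function: writes every string in a list (or other iterable) to the file, without adding newlines between them
Because the file is opened in 'w' mode here, any existing content is overwritten.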
file = open('test.txt','w',encoding='utf8')
file.writelines(['33333\n','44444'])
file.close()
file = open('test.txt','r',encoding='utf8')
print(file.read())
# Overwrites the original content of test.txt
# 33333
# 44444
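Since writelines() does not insert line separators, the strings must already carry their own "\n". The sketch below contrasts both cases; the scratch file name is an assumption made for the demo.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'writelines_demo.txt')
lines = ['33333', '44444']

# Without newlines in the strings, everything ends up on one line.
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(lines)
with open(path, 'r', encoding='utf-8') as f:
    joined = f.read()

# Appending '\n' to each string gives one line per item.
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(line + '\n' for line in lines)
with open(path, 'r', encoding='utf-8') as f:
    separated = f.read()

print(repr(joined), repr(separated))
```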
Appending to files
In 'a' mode, if the file exists, new content is added at the end of the file
If the file does not exist, an empty file is created and the content is appended to it
file = open('test1.txt','a',encoding='utf-8')
file.write('11111\n')
file.write('11111\n')
file.close()
file = open('test1.txt','r',encoding='utf-8')
print(file.read())
# Creates the test1.txt file (if it did not already exist)
# 11111
# 11111
Modifying files
A file is not modified in place: its contents are first read into memory (if the file exists), changed there, and then written back over the original file, so the end result looks like a direct modification.
with open('a.txt','r',encoding='gbk') as file:
    data = file.readlines()
    print(data)
with open('a.txt','w',encoding='gbk') as file:
    file.writelines([data[0],'Catch thief'])
# The content of the a.txt file:
# Police
# Catch thief
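The read, modify in memory, write back pattern can be wrapped in a small function. This is only a sketch: replace_in_file is a hypothetical helper name, and it assumes the file is small enough to fit in memory.

```python
import os
import tempfile

def replace_in_file(path, old, new, encoding='utf-8'):
    """Hypothetical helper: rewrite the file with every `old` replaced by `new`."""
    # Step 1: read the whole file into memory.
    with open(path, 'r', encoding=encoding) as f:
        content = f.read()
    # Step 2: modify the content in memory.
    content = content.replace(old, new)
    # Step 3: overwrite the original file with the modified content.
    with open(path, 'w', encoding=encoding) as f:
        f.write(content)

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'modify_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('Police\nThief\n')

replace_in_file(path, 'Thief', 'Catch thief')

with open(path, 'r', encoding='utf-8') as f:
    result = f.read()
print(result)
```

If the program is interrupted between emptying the file and finishing the write, the original content is lost, so for important data it is safer to write to a second file and rename it afterwards.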
Binary (b) mode of files
Binary mode reads and writes bytes, so no encoding may be specified; passing one raises a ValueError
The strings we see are produced by "decoding" the bytes stored on disk with some encoding, and writing a string "encodes" it back into bytes.
- There are two ways to convert a string to bytes. The encode() method is recommended here.
(1). bytes() function
//Format:bytes('str',encoding='FORMAT')
//Usage:
a = bytes('abc Police',encoding='utf-8')
print(type(a),'\n',a)
# <class 'bytes'>
# b'abc\xe8\xad\xa6\xe5\xaf\x9f'
(2). encode() function
//Format:'str'.encode('FORMAT')
//Usage:
b = 'abc Thief'.encode('utf-8')
print(type(b),'\n',b)
# <class 'bytes'>
# b'abc\xe5\xb0\x8f\xe5\x81\xb7'
- rb mode of files
Do not specify an encoding in binary mode; doing so raises a ValueError
Note: on Windows, a newline is stored as "\r\n"
with open('test.txt','rb') as f:
    data = f.read()
    print(data)
# b'11111\r\n22222\r\n33333\r\n44444\r\n\xe8\xad\xa6\xe5\xaf\x9f'
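Where the "\r\n" bytes come from can be reproduced on any platform with the newline parameter of open(). This is a sketch under assumptions: passing newline='\r\n' explicitly imitates the Windows default, and the scratch file is created only for the demo.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'newline_demo.txt')

# newline='\r\n' makes text mode store every '\n' as '\r\n',
# which is what the default does on Windows.
with open(path, 'w', encoding='utf-8', newline='\r\n') as f:
    f.write('11111\n22222\n')

# Binary mode shows the bytes exactly as stored on disk.
with open(path, 'rb') as f:
    raw = f.read()

# Text mode translates '\r\n' back to '\n' when reading.
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

print(raw, repr(text))
```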
- decode() method: a built-in method of bytes objects that converts bytes back to a string
# test.txt is encoded as UTF-8, so UTF-8 must also be used to decode it.
with open('test.txt','rb') as f:
    data = f.read()
    print(data.decode('utf-8'))
# 11111
# 22222
# 33333
# 44444
# 警察
- wb mode of files
Do not specify an encoding in binary mode; doing so raises a ValueError
with open('aaa.txt','wb') as f:
    data = f.write('policeman and thief'.encode('utf-8'))  # write() returns the number of bytes written
# Check what was written
with open('aaa.txt','r',encoding='utf-8') as f1:
    print(f1.read())
# policeman and thief
- ab mode of files
Do not specify an encoding in binary mode; doing so raises a ValueError
with open('aaa.txt','ab') as f:
    data = f.write('policeman and thief'.encode('utf-8'))
# Check what was written (the text is appended after the existing content)
with open('aaa.txt','r',encoding='utf-8') as f1:
    print(f1.read())
# policeman and thiefpoliceman and thief
Other attributes and methods of file objects
file.name: the file name (an attribute, not a method)
file.closed: whether the file is closed (an attribute, not a method)
file.flush(): flushes buffered content from memory to disk
file.tell(): returns the current cursor position
file.seek(): moves the cursor by a number of bytes (not characters)
From the start of the file (absolute position): file.seek(int,0)
Relative to the current cursor position: file.seek(int,1)
Relative to the end of the file: file.seek(int,2)
file.read() moves the cursor to the end of the file, so a second read() returns nothing until seek() moves the cursor back.
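The attributes and the tell() and flush() methods listed above can be tried out as follows. This is a minimal sketch: it uses a hypothetical scratch file and binary mode so that positions are plain byte counts.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'attrs_demo.bin')

f = open(path, 'wb')
name_matches = f.name == path      # name is an attribute, no parentheses
start = f.tell()                   # cursor position before writing
f.write(b'11111')
after_write = f.tell()             # cursor position after writing 5 bytes
f.flush()                          # hand the buffered bytes to the operating system
size_on_disk = os.path.getsize(path)
open_state = f.closed              # closed is also an attribute
f.close()
closed_state = f.closed

print(name_matches, start, after_write, size_on_disk, open_state, closed_state)
```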
- file.seek(int,0) function: moves the cursor to an absolute position counted from the start of the file
with open('aaa.txt','w',encoding='utf-8') as f:
    f.write('11111\n22222\n')
with open('aaa.txt','r',encoding='utf-8') as f:
    print('Current cursor position:',f.tell())
    print(f.read())
    f.seek(3) # Equivalent to f.seek(3,0)
    print('Current cursor position:',f.tell())
    print(f.read())
    f.seek(10) # Equivalent to f.seek(10,0)
    print('Current cursor position:',f.tell())
    print(f.read())
# Output on Windows, where each "\n" is stored as the two bytes "\r\n":
# Current cursor position: 0
# 11111
# 22222
# Current cursor position: 3
# 11
# 22222
# Current cursor position: 10
# 22
- file.seek(int,1) function: moves the cursor relative to its current position
Note: the file must be opened in binary mode such as "rb"; text mode does not support non-zero offsets relative to the current position
with open('aaa.txt','w',encoding='utf-8') as f:
    f.write('11111\n22222\n')
with open('aaa.txt','rb') as f:
    print('Current cursor position:',f.tell())
    f.seek(3,1)
    print('Current cursor position:',f.tell())
    f.seek(10,1)
    print('Current cursor position:',f.tell())
# Current cursor position: 0
# Current cursor position: 3
# Current cursor position: 13
- file.seek(int,2) function: moves the cursor relative to the end of the file
Note: the file must be opened in binary mode such as "rb"
Note: the offset is counted from the end of the file, so it must be negative (or zero) to land inside the file
with open('aaa.txt','w',encoding='utf-8') as f:
    f.write('11111\n22222\n')
with open('aaa.txt','rb') as f:
    print('Current cursor position:',f.tell())
    f.seek(-6,2)
    print('Current cursor position:',f.tell())
    print(f.read())
# Output on Windows, where the file is 14 bytes long:
# Current cursor position: 0
# Current cursor position: 8
# b'2222\r\n'
- file.truncate(int) function: keeps the first int bytes (not characters) of the file and discards the rest
The file must be opened in a writable mode that does not empty it on open: w, w+ and wb all clear the file first, so use r+, rb+ or a instead.
with open('aaa.txt','w',encoding='utf-8') as f:
    f.write('11111\n22222\n')
with open('aaa.txt','r+',encoding='utf-8') as f: # rb+ also works
    print('Current cursor position:',f.tell())
    print(f.truncate(10))
# Current cursor position: 0
# 10
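To see which bytes survive a truncate() call, the file can be read back afterwards. The sketch below is an illustration under assumptions: it uses a hypothetical scratch file and binary modes throughout, so the byte counts are the same on every platform.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'truncate_demo.bin')

with open(path, 'wb') as f:
    f.write(b'11111\n22222\n')     # 12 bytes in total

# rb+ opens for reading and writing without emptying the file.
with open(path, 'rb+') as f:
    new_size = f.truncate(10)      # keep only the first 10 bytes

with open(path, 'rb') as f:
    remaining = f.read()

print(new_size, remaining)
```

Calling truncate() without an argument cuts the file at the current cursor position instead of at a fixed size.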