Day8: File operations

Keywords: Python encoding Windows sublime

File operation

  1. Open the file 2. Operate on the file handle 3. Close the file.

open() takes three parameters: 1. the file path (folder path + file name + file extension); 2. the encoding; 3. the mode.

fl = open('d:\python.txt',encoding='utf-8',mode='r')   #d:\python.txt is a TXT file named python on my disk D. I wrote it in Sublime and saved it as utf-8, so the encoding parameter is utf-8.
content = fl.read()
print(content)
fl.close()      #Always close the file after each operation
>>>i love python

open: a built-in function. Under the hood, open calls the operating system's interface.

fl: a variable. By convention the variable used for file operations is named fl, f1, fh, file_handler, f_h, etc., and it is called the file handle. It is only a convention, so the name can be changed freely, but it is better to stick to it. Every operation on a file goes through a method of the file handle (fl.read(), etc.).

encoding: may be omitted, in which case the file is opened with the operating system's default codebook (on Windows this is the locale code page, GBK on Chinese-language systems; Linux: utf-8; mac: utf-8). It is better to write it explicitly: strings in memory are Unicode, while the bytes of a file on disk are not (see python basic learning day7 for details).

mode: may be omitted. If it is omitted, the file is opened as read-only ('r').
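The three parameters can be tried out in a small round trip. This is only a sketch: it assumes a scratch file in a temporary folder (the `path` variable is made up for illustration) instead of d:\python.txt, so that it runs on any system.

```python
import locale
import os
import tempfile

# Hypothetical scratch file; the post itself uses d:\python.txt.
path = os.path.join(tempfile.mkdtemp(), 'python.txt')

# Write with an explicit encoding so the bytes on disk are known to be utf-8.
f = open(path, encoding='utf-8', mode='w')
f.write('i love python')
f.close()

# mode is omitted here, so the file is opened read-only in text mode.
fl = open(path, encoding='utf-8')
default_mode = fl.mode
content = fl.read()
fl.close()

print(default_mode)   # r
print(content)        # i love python

# The codebook open() falls back to when encoding is omitted; it depends on the system.
print(locale.getpreferredencoding(False))
```

The last line prints different values on different machines, which is the reason to always pass encoding explicitly.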

Common causes of errors:

  1. UnicodeDecodeError: the codebook used to open the file is not the one that was used to save it.

  2. A problem with the path separator: the backslash (\) also introduces escape sequences such as \n, \t and \U. A path like C:\Users contains \U, so a file on the C disk may raise an error, as follows:

    fl = open('C:\Users\a1566\Desktop\python.txt',encoding='utf-8',mode='r')
    content = fl.read()
    print(content)
    >>>SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
    
        #Solution: add r before the file path (a raw string) so that backslashes are not treated as escape characters
    fl = open(r'C:\Users\a1566\Desktop\python.txt',encoding='utf-8',mode='r')
    content = fl.read()   
    print(content)
    >>>i love python
  • File read:

    • Four modes: r, rb, r+, r+b.

    • read(): with no parameter in the brackets, reads the whole file at once. With a parameter (a number), reads that many characters starting from the current cursor position. A line break in the file counts as one character.

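      fl = open(r'd:\python.txt',encoding='utf-8',mode='r')
      content = fl.read()
      print(content)
      >>>i love python
      fl.seek(0)   #Move the cursor back to the beginning before reading again
      content = fl.read(5)
      print(content)
      >>>i lov
      fl.close()   #Close only after the last read; reading a closed file raises ValueError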
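That read(n) counts characters rather than bytes is easier to see with Chinese text. A minimal sketch, assuming a made-up scratch file that contains two Chinese characters, a line break and the word python:

```python
import os
import tempfile

# Hypothetical scratch file; the text is chosen only for illustration.
path = os.path.join(tempfile.mkdtemp(), 'python.txt')
with open(path, encoding='utf-8', mode='w') as f:
    f.write('中文\npython')

with open(path, encoding='utf-8') as f:
    first = f.read(3)   # two Chinese characters plus the line break
    rest = f.read()     # continues from where the cursor stopped

print(repr(first))   # '中文\n'
print(repr(rest))    # 'python'
```

The two Chinese characters take six bytes on disk, but read(3) still returns them together with the line break.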
    • readline(): with no parameter in the brackets, reads one line. With a parameter (a number), reads at most that many characters of the line (counted the same way as in read). A line break in the file counts as one character. Note that the line read ends with the file's own line break, and print() adds another line break by default.

      fl = open(r'd:\python.txt',encoding='utf-8',mode='r')
      content = fl.readline()
      print(content)  #To suppress the extra line break that print adds by default, use print(content,end='')
      fl.close()
      >>>i love python.
           #The line break is read as well
    • readlines(): with no parameter in the brackets, reads all the lines and returns a list in which each element is one line of the source file. With a parameter (a number), whole lines are read and reading stops once the total size of the lines read so far exceeds that number.

      fl = open(r'd:\python.txt',encoding='utf-8',mode='r')
      content = fl.readlines()
      print(content,end='')
      fl.close()
      >>>['i love python.\n', 'i love you too.']
    • Reading in a loop: the file handle can be traversed with for. The file handle is an iterator, and each pass of the for loop reads only one line of the file, which saves memory. read(), readlines(), etc. load everything into memory at once, which causes problems if the file is too large.

      f = open(r'd:\python.txt',encoding='utf-8')
      for line in f:
          print(line)
      f.close()
      >>>i love python.
      
      i love you too.
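The three ways of reading lines can be compared side by side. A minimal sketch, assuming a scratch file with the same two lines as the examples above:

```python
import os
import tempfile

# Hypothetical scratch file with the two lines used in this post.
path = os.path.join(tempfile.mkdtemp(), 'python.txt')
with open(path, encoding='utf-8', mode='w') as f:
    f.write('i love python.\ni love you too.')

with open(path, encoding='utf-8') as f:
    first_line = f.readline()   # one line, including its line break
    remaining = f.readlines()   # everything after the cursor, as a list

# Looping over the handle reads one line at a time;
# rstrip removes the line break so print does not produce an empty line.
with open(path, encoding='utf-8') as f:
    stripped = [line.rstrip('\n') for line in f]

print(repr(first_line))   # 'i love python.\n'
print(remaining)          # ['i love you too.']
print(stripped)           # ['i love python.', 'i love you too.']
```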
    • rb: for non-text files such as pictures, video and audio. rb mode does not take an encoding.

      fl = open(r'd:\Snow scene.jpg',mode='rb')  #Snow scene.jpg is a picture
      print(fl.read())
      fl.close()
      >>>   #Too many bytes to show here, please test it yourself
    • r+: read and write (read, then append). Reading before writing is recommended.
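Without a picture at hand, rb mode can still be tried on a few bytes. The sketch below assumes a scratch file named sample.bin that holds four arbitrary bytes standing in for a picture.

```python
import os
import tempfile

# Hypothetical scratch file; the four bytes are arbitrary sample data.
path = os.path.join(tempfile.mkdtemp(), 'sample.bin')
with open(path, mode='wb') as f:
    f.write(b'\xff\xd8\xff\xe0')

# rb returns bytes, not str, and nothing is decoded.
with open(path, mode='rb') as f:
    data = f.read()
print(type(data).__name__, len(data))   # bytes 4

# Passing encoding together with a binary mode is rejected by open().
try:
    open(path, mode='rb', encoding='utf-8')
    rejected = False
except ValueError:
    rejected = True
print(rejected)   # True
```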

      f = open(r'd:\python.txt',encoding='utf-8',mode='r+')
      f.read()
      f.write('1234567')  #Read first and write later.
      f.close()
      • Reading a file uses a pointer (cursor), which starts at the beginning of the file. In r+ mode, if you read first and then write, the written content goes at the end of the file. If you write first and then read, writing starts at the beginning of the file, and each written character overwrites existing content until the write is finished.
      • Writing first and then reading can produce garbled text. Chinese characters, English letters and special characters occupy different numbers of bytes, and writing first overwrites the existing bytes. If the written string occupies a different number of bytes than the text it overwrites, the result is garbled text or a decoding error.
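The three r+ cases described above can be checked on scratch files. This is a sketch: make_file and read_back are helpers made up for the example.

```python
import os
import tempfile

def make_file(text):
    # Hypothetical helper: create a fresh utf-8 scratch file and return its path.
    path = os.path.join(tempfile.mkdtemp(), 'python.txt')
    with open(path, encoding='utf-8', mode='w') as f:
        f.write(text)
    return path

def read_back(path):
    # Hypothetical helper: return the whole file decoded as utf-8.
    with open(path, encoding='utf-8') as f:
        return f.read()

# Read first, then write: the cursor is at the end, so the text is appended.
path = make_file('i love python')
f = open(path, encoding='utf-8', mode='r+')
f.read()
f.write('1234567')
f.close()
read_then_write = read_back(path)
print(read_then_write)   # i love python1234567

# Write first: the cursor is at the start, so existing characters are overwritten.
path = make_file('i love python')
f = open(path, encoding='utf-8', mode='r+')
f.write('I LOVE')
f.close()
write_first = read_back(path)
print(write_first)       # I LOVE python

# Overwriting one byte of a three-byte Chinese character breaks the file.
path = make_file('中文')
f = open(path, encoding='utf-8', mode='r+')
f.write('a')
f.close()
try:
    read_back(path)
    garbled = False
except UnicodeDecodeError:
    garbled = True
print(garbled)           # True
```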
  • File write:

    • Four modes: w, wb, w+, w+b

    • w, wb: if the file already exists, its contents are emptied before writing. If it does not exist, it is created.

      • Emptying: the file is emptied when it is opened, and then written. While the file handle stays open, you can write repeatedly without the file being emptied again. The file is emptied again only if you close the file handle and reopen the file in 'w' mode.
      f = open('d:/text.txt',encoding='utf-8',mode='w') #Creates a TXT file named text on disk D. If the file already exists, its contents are emptied before writing
      f.write('i love python') 
      f.close()
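The emptying rule of w mode can be observed directly. A minimal sketch, assuming a scratch file named text.txt in a temporary folder instead of d:/text.txt:

```python
import os
import tempfile

# Hypothetical scratch file that already has some content.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')
with open(path, encoding='utf-8', mode='w') as f:
    f.write('old content')

# Opening in w mode empties the file once; repeated writes then accumulate.
f = open(path, encoding='utf-8', mode='w')
f.write('first\n')
f.write('second\n')
f.close()
with open(path, encoding='utf-8') as f:
    after_two_writes = f.read()
print(repr(after_two_writes))   # 'first\nsecond\n'

# Reopening in w mode empties the file again.
f = open(path, encoding='utf-8', mode='w')
f.write('third')
f.close()
with open(path, encoding='utf-8') as f:
    after_reopen = f.read()
print(after_reopen)             # third
```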
  • File appending:

    • Four modes: a, ab, a+, a+b

    • a: if the file does not exist, it is created. If it exists, the new content is appended directly after the existing content.

      f = open('d:/text.txt',encoding='utf-8',mode='a')  
      f.write('\ni love you too') 
      f.close()
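Both halves of the a rule, create if missing and append otherwise, can be checked like this. The sketch assumes a scratch path that does not exist yet.

```python
import os
import tempfile

# Hypothetical scratch path; the file does not exist at this point.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')
existed_before = os.path.exists(path)

# First open in a mode: the file is created.
f = open(path, encoding='utf-8', mode='a')
f.write('i love python')
f.close()

# Second open in a mode: the existing content is kept and the new text is added after it.
f = open(path, encoding='utf-8', mode='a')
f.write('\ni love you too')
f.close()

with open(path, encoding='utf-8') as f:
    lines = f.read().split('\n')

print(existed_before)   # False
print(lines)            # ['i love python', 'i love you too']
```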
  • tell() returns the position of the pointer (cursor) in bytes (utf-8: one Chinese character is three bytes, one letter is one byte; see day2 for details)

    fl = open(r'd:\python.txt',encoding='utf-8')
    print(fl.tell())
    >>>0
    content = fl.read()
    print(fl.tell())
    >>>25
    fl.close()
  • seek() moves the cursor to the given position, in bytes

    fl = open(r'd:\python.txt',encoding='utf-8')
    print(fl.seek(8))   #seek returns the new cursor position
    fl.close()
  • flush() forces a refresh (save): it pushes buffered data out to the file. It is usually used when writing files. After writing, calling flush on the file handle helps avoid a failed save.

    f = open(r'd:\text.txt',encoding='utf-8',mode='w')  
    f.write('\ni love you too') 
    f.flush()
    f.close()
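The effect of flush() can be seen by reading the file through a second handle while the first one is still open. A minimal sketch, assuming a scratch file:

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')

f = open(path, encoding='utf-8', mode='w')
f.write('i love python')
f.flush()   # push the buffered text out to the file without closing it

# A second handle already sees the text although f is still open.
with open(path, encoding='utf-8') as reader:
    seen = reader.read()
f.close()

print(seen)   # i love python
```

close() also flushes, so an explicit flush() matters mostly when the handle stays open for a long time.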
  • Another way to open a file (recommended)

    • with open() as f:
      • Advantages: the file handle does not have to be closed manually, it is closed automatically when the with block ends; one with statement can operate on multiple files.
      • Disadvantages:
    with open(r'd:\text.txt',encoding='utf-8',mode='a') as f:
        f.write('\ni love you too') 
    
    #Open multiple files:
    #The backslash at the end of the first line is a line continuation. No characters may follow it. It can be used when a line of code is too long
    with open(r'd:\text.txt',encoding='utf-8',mode='a') as f1,\
         open(r'd:\python.txt',encoding='utf-8',mode='r') as f2:   #f2 is opened in r mode because it is read below
        f1.write('\ni love you too') 
        f2.read()
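That with really closes the handle can be checked through the closed attribute. A sketch using a scratch file; the RuntimeError is raised on purpose to simulate a failure in the middle of the block.

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')

with open(path, encoding='utf-8', mode='w') as f:
    f.write('i love python')
    inside = f.closed    # still open inside the block
after = f.closed         # closed as soon as the block ends

# The handle is closed even if the block stops with an error.
try:
    with open(path, encoding='utf-8') as g:
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
closed_after_error = g.closed

print(inside, after, closed_after_error)   # False True True
```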
  • File modification:

    • Under the hood, mainstream software that edits files (Word, Notepad, etc.) modifies a file in these basic steps:

      • 1. Open the original file in read mode
      • 2. Create a new file in write mode
      • 3. Read the contents of the original file, modify them and write the new contents into the new file
      • 4. Delete the original file (python needs to import the os module)
      • 5. Rename the new file (python needs to import the os module)
    • Worked example (change the lowercase o in the python.txt file on disk D to uppercase. The content of python.txt is: i love python.\ni love you too. The file is simple, please create it yourself)

      import os   #Import the os module
      #1. Open the original file in read mode
      #2. Create a new file in write mode
      with open(r'd:\python.txt',encoding='utf-8') as f1,\
           open(r'd:\python.bak',encoding='utf-8',mode='w') as f2:   #.bak is a backup file type
          #3. Read the contents of the original file, modify them and write the new contents into the new file
          old_content = f1.read()  #read returns a str
          new_content = old_content.replace('o','O')
          f2.write(new_content)
      #4. Delete the original file
      os.remove(r'd:\python.txt')
      #5. Rename the new file
      os.rename(r'd:\python.bak',r'd:\python.txt')

      The above method (read) is only suitable for small files. With a large file, loading everything into memory at once causes problems. Therefore, you can change it as follows:

      import os   #Import the os module
      #1. Open the original file in read mode
      #2. Create a new file in write mode
      with open(r'd:\python.txt',encoding='utf-8') as f1,\
           open(r'd:\python.bak',encoding='utf-8',mode='w') as f2:   #.bak is a backup file type
          #3. Read the original file line by line, modify each line and write it into the new file
          for old_line in f1:
              new_line = old_line.replace('o','O')
              f2.write(new_line)
      #4. Delete the original file
      os.remove(r'd:\python.txt')
      #5. Rename the new file
      os.rename(r'd:\python.bak',r'd:\python.txt')
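The five steps can also be wrapped into a reusable function. This is only a sketch under stated assumptions: replace_in_file is a name made up here, the file lives in a temporary folder, and os.replace is swapped in for the separate os.remove and os.rename calls because it overwrites the original file in one step.

```python
import os
import tempfile

def replace_in_file(path, old, new, encoding='utf-8'):
    # Hypothetical helper: rewrite path line by line, replacing old with new.
    bak_path = path + '.bak'
    with open(path, encoding=encoding) as f1,\
         open(bak_path, encoding=encoding, mode='w') as f2:
        for old_line in f1:
            f2.write(old_line.replace(old, new))
    # Steps 4 and 5 in one call: the new file takes the place of the original.
    os.replace(bak_path, path)

# Usage on a scratch copy of the python.txt example.
path = os.path.join(tempfile.mkdtemp(), 'python.txt')
with open(path, encoding='utf-8', mode='w') as f:
    f.write('i love python.\ni love you too.')

replace_in_file(path, 'o', 'O')

with open(path, encoding='utf-8') as f:
    result = f.read()
print(result)
```

After the call only python.txt remains in the folder; the .bak file has been renamed over it.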

Posted by cfemocha on Wed, 18 Mar 2020 09:15:56 -0700