20190127 - split a file into multiple new files

Keywords: Python encoding less

1. Split one file into five files

Train of thought:

1. First, the file is split into multiple new files, so each new file needs its own name. Use file_name_no to number the new files and build their names.

2. The content for each new file is collected in file_content. After a new file has been written, reset it with file_content = '' so that the next file starts empty.

3. When to write a new file: split according to the length of the original file. If one file is to be divided into five, divide the number of lines of the original file by five so that the content is spread over the five new files as evenly as possible. This needs a variable that holds the number of lines of the original file; file_length is used for that. Then read the file again line by line and write a new file each time the number of lines read reaches 1/5, 2/5, 3/5 and 4/5 of the total. Whatever is left after the last of these points goes into the fifth file.
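The arithmetic in step 3 can be tried out without touching any files. The following is only a minimal sketch: the helper names write_points and chunk_sizes are hypothetical and do not appear in the script, and the division is assumed to round down.

```python
def write_points(file_length, num):
    """Line counts at which new files 1 .. num-1 are written.

    Hypothetical helper for illustration; assumes the per-file line
    count is file_length divided by num, rounded down.
    """
    chunk = file_length // num
    return [chunk * n for n in range(1, num)]


def chunk_sizes(file_length, num):
    """Number of lines each of the num new files ends up with."""
    chunk = file_length // num
    # The first num-1 files get `chunk` lines each; the last file gets
    # everything that is left, including the remainder of the division.
    return [chunk] * (num - 1) + [file_length - chunk * (num - 1)]


print(write_points(10, 3))  # [3, 6]
print(chunk_sizes(10, 3))   # [3, 3, 4]
```

With 10 lines and 3 files, files are written after line 3 and line 6, and the remaining 4 lines go into the third file.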

import os

def split_file(file_dir, file_name, num):
    # file_dir must end with a path separator, because paths are built with +
    base_name = os.path.splitext(file_name)[0]
    file_name_no = 1
    # file_name_no is the number of the new file currently being filled
    file_content = ''
    with open(file_dir + file_name, 'r', encoding='utf-8') as fp1:
        file_length = len(fp1.readlines())
        print(file_length)
        # After fp1.readlines() the cursor is at the end of the file,
        # so fp1.seek(0, 0) moves it back to the beginning
        fp1.seek(0, 0)
        file_line = 0
        # file_line records the number of lines read so far
        for line in fp1:
            file_content += line
            file_line += 1
            # Write a new file when the number of lines read reaches
            # 1/num, 2/num, ... of the original file. The last file is
            # not written here but after the loop.
            if file_line == (file_length // num) * file_name_no and file_name_no < num:
                with open(file_dir + 'new' + base_name + str(file_name_no) + '.txt', 'w', encoding='utf-8') as fp2:
                    fp2.write(file_content)
                file_name_no += 1
                file_content = ''
        if file_content:
            # Everything that is left goes into the last file. If the line
            # count is not a multiple of num, the last file therefore holds
            # the extra lines. Open with 'w' rather than 'a' so that running
            # the script again does not append a second copy.
            with open(file_dir + 'new' + base_name + str(num) + '.txt', 'w', encoding='utf-8') as fp2:
                fp2.write(file_content)

split_file('D:\\Python\\', 'b.txt', 3)
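The function above puts the whole remainder of the division into the last file. If the parts should differ by at most one line, the remainder can be spread over the first files instead. This is a different technique from the one used in the script, sketched here on a list of lines in memory; the name even_chunks is hypothetical.

```python
def even_chunks(lines, num):
    """Split a list of lines into num consecutive chunks whose sizes
    differ by at most one (hypothetical helper, not the script's method)."""
    base, extra = divmod(len(lines), num)
    chunks = []
    start = 0
    for i in range(num):
        # The first `extra` chunks each take one additional line
        size = base + (1 if i < extra else 0)
        chunks.append(lines[start:start + size])
        start += size
    return chunks


lines = ['line %d\n' % i for i in range(1, 11)]
print([len(c) for c in even_chunks(lines, 3)])  # [4, 3, 3]
```

Each chunk could then be written to its own file in the same way split_file writes file_content.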

 

Tips: file_dir is the path of the directory and file_name is the name of the file. Use os.path.splitext(file_name)[0] to get the original file name without its extension, and use file_dir + 'new' + os.path.splitext(file_name)[0] + str(file_name_no) + '.txt' to build the name of each new file. After a file is written, add 1 to file_name_no and empty file_content; finally, write whatever is left in file_content to the last file.
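The file name construction can be looked at in isolation. In this small sketch the helper name new_file_path and the directory some_dir are made up for illustration, and os.path.join is swapped in for plain + concatenation so that file_dir does not need a trailing separator.

```python
import os


def new_file_path(file_dir, file_name, file_name_no):
    """Build the path of the n-th new file (hypothetical helper)."""
    # splitext splits 'b.txt' into the name 'b' and the extension '.txt'
    base_name = os.path.splitext(file_name)[0]
    return os.path.join(file_dir, 'new' + base_name + str(file_name_no) + '.txt')


print(os.path.splitext('b.txt'))  # ('b', '.txt')
print(os.path.basename(new_file_path('some_dir', 'b.txt', 2)))  # newb2.txt
```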

Posted by aveach on Sat, 30 Nov 2019 15:40:26 -0800