Python 3 standard library: tempfile, temporary file system objects

Keywords: Python encoding Attribute

1. tempfile: temporary file system objects

It is difficult to safely create a temporary file with a unique name, one that cannot be guessed by someone trying to break the application or steal its data. The tempfile module provides several functions for creating temporary file system resources securely. TemporaryFile() opens and returns an unnamed file, and NamedTemporaryFile() opens and returns a named file. SpooledTemporaryFile keeps its content in memory before writing it to disk. TemporaryDirectory is a context manager that removes the directory when the context is closed.

1.1 temporary files

If your application needs temporary files to store data without sharing them with other programs, use the TemporaryFile() function to create them. This function creates a file and, if the platform supports it, immediately unlinks the new file. Other programs then cannot find or open the file, because there is no reference to it in the file system table. A file created by TemporaryFile() is deleted automatically when it is closed, whether by calling close() or by using the context manager API and the with statement.

import os
import tempfile

print('Building a filename with PID:')
filename = os.path.join(tempfile.gettempdir(),
                        'guess_my_name.{}.txt'.format(os.getpid()))
with open(filename, 'w+b') as temp:
    print('temp:')
    print('  {!r}'.format(temp))
    print('temp.name:')
    print('  {!r}'.format(temp.name))

# Clean up the temporary file yourself.
os.remove(filename)

print()
print('TemporaryFile:')
with tempfile.TemporaryFile() as temp:
    print('temp:')
    print('  {!r}'.format(temp))
    print('temp.name:')
    print('  {!r}'.format(temp.name))

This example shows the difference between two ways of creating temporary files. One builds the file name from a predictable pattern, and the other uses the TemporaryFile() function. On platforms that support unlinking, the file returned by TemporaryFile() has no name in the file system.

By default, the file handle is created with mode 'w+b' so that it behaves consistently on all platforms and allows the caller to both read and write the file.

import os
import tempfile

with tempfile.TemporaryFile() as temp:
    temp.write(b'Some data')

    temp.seek(0)
    print(temp.read())

After writing to the file, you must use seek() to "rewind" the file handle before reading the data back.
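To make the need for seek() concrete, here is a minimal sketch that reads once without rewinding and once after it. The variable names without_seek and with_seek are only illustrative.

```python
import tempfile

with tempfile.TemporaryFile() as temp:
    temp.write(b'Some data')

    # The position is at the end of the file, so nothing is left to read.
    without_seek = temp.read()

    # Rewind to the beginning before reading the data back.
    temp.seek(0)
    with_seek = temp.read()

print('without seek():', without_seek)
print('with seek(0):  ', with_seek)
```

The first read returns an empty bytes object because the file position still sits just past the data that was written.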

To open a file in text mode, set mode to 'w+t' when creating the file.

import tempfile

with tempfile.TemporaryFile(mode='w+t') as f:
    f.writelines(['first\n', 'second\n'])

    f.seek(0)
    for line in f:
        print(line.rstrip())

The file handle then treats the data as text, so it accepts and returns str objects instead of bytes.
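The difference between the two modes shows up in the type of object that read() returns. The sketch below writes the same content in the default binary mode and in text mode. The explicit encoding='utf-8' is an assumption added for portability and is not part of the example above.

```python
import tempfile

# Default mode 'w+b': the file accepts and returns bytes.
with tempfile.TemporaryFile() as binary_file:
    binary_file.write(b'data\n')
    binary_file.seek(0)
    binary_result = binary_file.read()

# Text mode 'w+t': the file accepts and returns str.
with tempfile.TemporaryFile(mode='w+t', encoding='utf-8') as text_file:
    text_file.write('data\n')
    text_file.seek(0)
    text_result = text_file.read()

print(type(binary_result).__name__, repr(binary_result))
print(type(text_result).__name__, repr(text_result))
```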

1.2 named files

In some cases, you may need a named temporary file. For applications that span multiple processes or even hosts, naming the file is the easiest way to pass it between different parts of the application. The NamedTemporaryFile() function creates a file but does not unlink it, so the file keeps its name (accessed through the name attribute).

import os
import pathlib
import tempfile

with tempfile.NamedTemporaryFile() as temp:
    print('temp:')
    print('  {!r}'.format(temp))
    print('temp.name:')
    print('  {!r}'.format(temp.name))

    f = pathlib.Path(temp.name)

print('Exists after close:', f.exists())

The file will be deleted when the handle is closed.
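When another part of the application has to open the file by name after the handle is closed, automatic deletion gets in the way. One possible approach, sketched here, is the delete=False argument of NamedTemporaryFile(). The plain open() call stands in for "another part of the application", and the cleanup in the finally block is an assumption about how the caller would manage the file.

```python
import os
import tempfile

# delete=False keeps the file on disk after the handle is closed.
with tempfile.NamedTemporaryFile(mode='w+t', suffix='.txt',
                                 delete=False) as temp:
    temp.write('shared between steps\n')
    path = temp.name

try:
    # A separate open() call, standing in for another consumer of the file.
    with open(path) as reader:
        content = reader.read()
    print('read back:', content.rstrip())
finally:
    # With delete=False, removing the file is the caller's responsibility.
    os.remove(path)

exists_after_cleanup = os.path.exists(path)
print('Exists after cleanup:', exists_after_cleanup)
```

The trade-off is that the file is no longer cleaned up automatically, so a crash between creation and os.remove() leaves it behind.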

1.3 spooled files

If the temporary file will hold relatively little data, it may be more efficient to use SpooledTemporaryFile. It keeps the content in memory, in an io.BytesIO or io.StringIO buffer (recent Python versions use an io.TextIOWrapper in text mode), until the data reaches a threshold size. At that point the data is "rolled over" and written to disk, and the buffer is replaced with a regular TemporaryFile().

import tempfile

with tempfile.SpooledTemporaryFile(max_size=100,
                                   mode='w+t',
                                   encoding='utf-8') as temp:
    print('temp: {!r}'.format(temp))

    for i in range(3):
        temp.write('This line is repeated over and over.\n')
        print(temp._rolled, temp._file)

This example uses private attributes of SpooledTemporaryFile to determine when the rollover to disk happens. You rarely need to check this state, except when tuning the buffer size.

To explicitly write the buffer to disk, call the rollover() or fileno() method.
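The threshold behavior is easier to see with a small max_size and binary data, where the byte count is unambiguous. This sketch relies on the same private _rolled attribute as the example above, so it describes CPython's current implementation and is not a supported API.

```python
import tempfile

states = []
with tempfile.SpooledTemporaryFile(max_size=10) as temp:
    temp.write(b'12345')          # 5 bytes: below max_size, stays in memory
    states.append(temp._rolled)   # _rolled is a private attribute

    temp.write(b'6789012345')     # 15 bytes in total: exceeds max_size
    states.append(temp._rolled)

print(states)
```

The rollover happens as part of the write that pushes the size past max_size, not at some later point.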

import tempfile

with tempfile.SpooledTemporaryFile(max_size=1000,
                                   mode='w+t',
                                   encoding='utf-8') as temp:
    print('temp: {!r}'.format(temp))

    for i in range(3):
        temp.write('This line is repeated over and over.\n')
        print(temp._rolled, temp._file)
    print('rolling over')
    temp.rollover()
    print(temp._rolled, temp._file)

In this case, because the buffer is very large, far larger than the actual amount of data, no files will be created on the disk unless rollover() is called.
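fileno() forces a rollover as well, because a file descriptor only exists for a real file. A minimal sketch, again peeking at the private _rolled attribute for illustration only:

```python
import tempfile

with tempfile.SpooledTemporaryFile(max_size=1000) as temp:
    temp.write(b'small payload')
    rolled_before = temp._rolled

    # Asking for a descriptor requires a real file, so this rolls over.
    fd = temp.fileno()
    rolled_after = temp._rolled

    # The data written before the rollover is still available.
    temp.seek(0)
    data = temp.read()

print('before fileno():', rolled_before)
print('after fileno(): ', rolled_after)
print('data:', data)
```

This matters when a spooled file is passed to code that needs a descriptor, such as a subprocess: the in-memory advantage is lost at that point.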

1.4 temporary directory

When several temporary files are needed, it may be more convenient to create a single temporary directory with TemporaryDirectory and create all of the files in that directory.

import pathlib
import tempfile

with tempfile.TemporaryDirectory() as directory_name:
    the_dir = pathlib.Path(directory_name)
    print(the_dir)
    a_file = the_dir / 'a_file.txt'
    a_file.write_text('This file is deleted.')

print('Directory exists after?', the_dir.exists())
print('Contents after:', list(the_dir.glob('*')))

The context manager generates the directory name, which can be used to create other file names in the context block.
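As a sketch of the several-files case, the following creates three files inside one temporary directory and confirms that everything is gone once the block exits. The file names are arbitrary placeholders.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as directory_name:
    the_dir = pathlib.Path(directory_name)

    # Build every file name from the generated directory name.
    for name in ('a.txt', 'b.txt', 'c.txt'):
        (the_dir / name).write_text('contents of {}'.format(name))

    names_inside = sorted(p.name for p in the_dir.iterdir())
    print('Inside the block:', names_inside)

# Leaving the block removes the directory and everything in it.
exists_after = the_dir.exists()
print('Directory exists after?', exists_after)
```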

1.5 location of temporary files

If you do not use the dir parameter to specify an explicit target location, the path used by the temporary file changes based on the current platform and settings. The tempfile module includes two functions to query the settings used at run time.

import tempfile

print('gettempdir():', tempfile.gettempdir())
print('gettempprefix():', tempfile.gettempprefix())

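gettempdir() returns the default directory that will hold all of the temporary files, and gettempprefix() returns the string prefix used for new file and directory names.

The value returned by gettempdir() is set by a simple algorithm. It searches a list of locations and uses the first one in which the current process can create a file. The list consists of the directories named by the TMPDIR, TEMP, and TMP environment variables, then a platform-specific fallback (C:\TEMP, C:\TMP, \TEMP, or \TMP on Windows; /tmp, /var/tmp, or /usr/tmp elsewhere), and finally the current working directory.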
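The defaults can also be overridden for a single file. This sketch passes the dir, prefix, and suffix arguments to NamedTemporaryFile(). The 'report_' and '.csv' values are hypothetical, and a TemporaryDirectory serves as the explicit target so that the example cleans up after itself.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    with tempfile.NamedTemporaryFile(dir=base_dir,
                                     prefix='report_',
                                     suffix='.csv') as temp:
        temp_path = pathlib.Path(temp.name)
        # The random part of the name sits between the prefix and the suffix.
        print('name:  ', temp_path.name)
        print('parent:', temp_path.parent)
        in_base_dir = temp_path.parent == pathlib.Path(base_dir)
```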

import tempfile

tempfile.tempdir = '/I/changed/this/path'
print('gettempdir():', tempfile.gettempdir())

If the program needs one global location for all temporary files and does not want to rely on the TMPDIR, TEMP, or TMP environment variables, it can assign a value to tempfile.tempdir directly.
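Because tempfile.tempdir is a module-level global, changing it affects every later call in the process. A cautious sketch, assuming the override is only wanted for a limited scope, points it at a directory that really exists and restores the previous value afterwards:

```python
import os
import tempfile

original = tempfile.tempdir    # remember the current setting (often None)

with tempfile.TemporaryDirectory() as scratch:
    tempfile.tempdir = scratch
    try:
        reported = tempfile.gettempdir()
        # With no dir argument, new files now land in the scratch directory.
        with tempfile.NamedTemporaryFile() as temp:
            created_in = os.path.dirname(temp.name)
    finally:
        # Restore the previous value so later code is not affected.
        tempfile.tempdir = original

print('gettempdir():', reported)
print('file created in:', created_in)
```

Unlike the path in the example above, the directory here exists, so files can actually be created in it.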

Posted by Patrick on Sun, 15 Mar 2020 19:17:23 -0700