[Python 3 learning notes - basic syntax] learn about standard libraries in Python

Keywords: Python crawler regex

Operating system interface

The os module provides many functions associated with the operating system.

>>> import os
>>> os.getcwd()      # Returns the current working directory
>>> os.chdir('/server/accesslogs')   # Change the current working directory
>>> os.system('mkdir today')   # Run the command mkdir in the system shell

Use the "import os" style instead of "from os import *". The wildcard form would let os.open() shadow the built-in open() function, which behaves quite differently.
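To make the difference concrete, here is a minimal sketch showing that os.open() and the built-in open() are separate functions with different return types. The file name example.txt is a hypothetical placeholder, and the file is created in a temporary directory.

```python
import builtins
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")   # hypothetical file name

    # The built-in open() returns a file object.
    with open(path, "w") as f:
        f.write("hello")

    # os.open() is a low-level call that returns an integer file descriptor.
    fd = os.open(path, os.O_RDONLY)
    fd_is_int = isinstance(fd, int)
    data = os.read(fd, 5)
    os.close(fd)

# The two names refer to different functions.
same_function = os.open is builtins.open
print(fd_is_int, data, same_function)
```

After "from os import *", the plain name open would refer to the low-level function, so code such as open(path, "w") would stop working as expected.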

The built-in dir() and help() functions are very useful when using large modules such as os:

>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstrings>


For daily file and directory management tasks, the shutil module provides an easy-to-use high-level interface:

>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
>>> shutil.move('/build/executables', 'installdir')
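The paths in the example above must already exist on your machine. The following sketch performs the same two operations inside a temporary directory so it can be run anywhere. The file contents are placeholder data.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data.db")
    backup = os.path.join(tmp, "archive.db")
    with open(src, "wb") as f:
        f.write(b"example")                  # placeholder contents

    # copyfile() copies the file contents; the source stays in place.
    shutil.copyfile(src, backup)

    # When the destination is an existing directory, move() puts the
    # source inside that directory.
    target_dir = os.path.join(tmp, "installdir")
    os.mkdir(target_dir)
    shutil.move(backup, target_dir)

    source_kept = os.path.exists(src)
    moved_ok = os.path.exists(os.path.join(target_dir, "archive.db"))
    backup_gone = not os.path.exists(backup)

print(source_kept, moved_ok, backup_gone)
```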

File wildcards

The glob module provides a function to generate a file list from a directory wildcard search:

>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']
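As a sketch of what else glob can do, the example below builds a small temporary directory tree (the file names are made up for illustration) and compares a plain wildcard with a recursive "**" search. Results are sorted because glob makes no promise about ordering.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files, one of them in a subdirectory.
    for name in ("primes.py", "quote.py", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "pkg"))
    open(os.path.join(tmp, "pkg", "util.py"), "w").close()

    # '*.py' matches only files directly inside tmp.
    top_level = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*.py"))
    )

    # With recursive=True, '**' also matches files in subdirectories.
    recursive = sorted(
        os.path.relpath(p, tmp)
        for p in glob.glob(os.path.join(tmp, "**", "*.py"), recursive=True)
    )

print(top_level)
print(recursive)
```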

Command line parameters

Common utility scripts often need to process command line arguments. These arguments are stored in the argv attribute of the sys module as a list. For example, after running "python demo.py one two three" on the command line, sys.argv holds the following:

>>> import sys
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']
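Because sys.argv is an ordinary list, the script name and the real arguments can be separated by indexing and slicing. The helper split_argv below is a hypothetical name used only for this sketch, and the list passed to it simulates the command line from the example above.

```python
def split_argv(argv):
    """Return (script_name, arguments) from an argv-style list."""
    return argv[0], argv[1:]

# Simulated command line: python demo.py one two three
# In a real script you would pass sys.argv instead.
script_name, arguments = split_argv(["demo.py", "one", "two", "three"])
print(script_name)
print(arguments)
```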

Error output redirection and program termination

The sys module also has stdin, stdout, and stderr attributes. The last of these is useful for emitting warnings and error messages so that they remain visible even when stdout has been redirected:

>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one

The most direct way to terminate a script is to call "sys.exit()".
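sys.exit() works by raising the SystemExit exception, and the value passed to it becomes the exit status of the process. The sketch below catches that exception only to show the status code; the run function and the status value 2 are illustrative choices, not a convention required by Python.

```python
import sys

def run(args):
    """Return the number of inputs, or exit with status 2 if there are none."""
    if not args:
        sys.stderr.write("error: no input files given\n")
        sys.exit(2)          # raises SystemExit(2)
    return len(args)

try:
    run([])
except SystemExit as exc:
    # A normal script would not catch this; we do so only to inspect it.
    exit_code = exc.code

print(exit_code)
```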

String pattern matching

The re module provides regular expression tools for advanced string processing. For complex matching and processing, regular expressions provide concise and optimized solutions:

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'
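When the same pattern is used more than once, it can be compiled first and reused. This sketch applies the pattern from the example above with re.compile() and finditer() to report where each match starts; the variable names are illustrative.

```python
import re

text = "which foot or hand fell fastest"

# Compile once, then reuse the pattern object.
f_words = re.compile(r"\bf[a-z]*")

matches = f_words.findall(text)

# finditer() yields match objects, which also carry position information.
positions = [(m.group(), m.start()) for m in f_words.finditer(text)]

print(matches)
print(positions)
```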

When only simple capabilities are needed, prefer string methods, because they are easier to read and debug:

>>> 'tea for too'.replace('too', 'two')
'tea for two'


Mathematics

The math module provides access to the underlying C library functions for floating-point math:

>>> import math
>>> math.cos(math.pi / 4)
0.7071067811865476
>>> math.log(1024, 2)
10.0

The random module provides tools for making random selections:

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
>>> random.sample(range(100), 10)   # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random()    # random float
>>> random.randrange(6)    # random integer chosen from range(6)
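The results above differ on every run. When repeatable results are needed, for example in tests, a separate generator can be created with a fixed seed. This is a minimal sketch; the seed value 42 is arbitrary.

```python
import random

# Two independent generators created with the same seed
# produce the same sequence of values.
rng_a = random.Random(42)
rng_b = random.Random(42)

sample_a = rng_a.sample(range(100), 10)   # sampling without replacement
sample_b = rng_b.sample(range(100), 10)

roll = rng_a.randrange(6)                 # integer chosen from range(6)

print(sample_a == sample_b)
```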

Access to the Internet

There are several modules for accessing the Internet and handling network communication protocols. Two of the simplest are urllib.request for retrieving data from URLs and smtplib for sending e-mail:

>>> from urllib.request import urlopen
>>> for line in urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl'):
...     line = line.decode('utf-8')  # Decoding the binary data to text.
...     if 'EST' in line or 'EDT' in line:  # look for Eastern Time
...         print(line)

<BR>Nov. 25, 09:43:32 PM EST

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
... """To: jcaesar@example.org
... From: soothsayer@example.org
...
... Beware the Ides of March.
... """)
>>> server.quit()

Note that the second example needs a mail server running on localhost.
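Writing the message headers by hand is easy to get wrong, because a blank line must separate the headers from the body. As an alternative sketch, the standard email.message module can build the message, which can then be inspected without any mail server. The subject line here is a made-up value that does not appear in the example above.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "soothsayer@example.org"
msg["To"] = "jcaesar@example.org"
msg["Subject"] = "A warning"          # hypothetical subject line
msg.set_content("Beware the Ides of March.")

# The serialized form has headers, a blank line, then the body.
raw = msg.as_string()
print(raw)

# With a mail server running on localhost, the message could be sent with:
#     import smtplib
#     with smtplib.SMTP("localhost") as server:
#         server.send_message(msg)
```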

Date and time

The datetime module supplies classes for manipulating dates and times in both simple and complex ways.

Date and time arithmetic is supported, but the implementation focuses on efficient member extraction for output formatting and manipulation.

The module also supports objects that are time zone aware.

>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368
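The examples above use naive dates only. As a sketch of the time zone support mentioned earlier, the code below converts a UTC time to a fixed-offset zone. The offset of minus five hours and the label "EST" are assumptions for illustration; a fixed offset does not follow daylight saving rules.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime carries its own UTC offset.
utc_time = datetime(2003, 12, 2, 14, 30, tzinfo=timezone.utc)

# A fixed-offset zone, five hours behind UTC (assumed for illustration).
est = timezone(timedelta(hours=-5), "EST")

# astimezone() changes the representation, not the moment in time.
local_time = utc_time.astimezone(est)
formatted = local_time.strftime("%Y-%m-%d %H:%M %Z")

print(formatted)
```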

Data compression

The following modules directly support common data packaging and compression formats: zlib, gzip, bz2, zipfile, and tarfile.

>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979
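The other modules follow a similar pattern. This sketch compresses the same bytes in memory with both zlib and gzip and checks that each round trip restores the original data. The input is repeated 50 times only so that the compression is clearly visible; that factor is arbitrary.

```python
import gzip
import zlib

data = b"witch which has which witches wrist watch" * 50

# Compress with the highest compression level (9) in both modules.
zlib_bytes = zlib.compress(data, 9)
gzip_bytes = gzip.compress(data, compresslevel=9)

# Decompressing must give back exactly the original bytes.
zlib_round_trip = zlib.decompress(zlib_bytes)
gzip_round_trip = gzip.decompress(gzip_bytes)

print(len(data), len(zlib_bytes), len(gzip_bytes))
```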

Performance metrics

Some Python users develop a deep interest in the relative performance of different approaches to the same problem. Python provides a measurement tool that answers those questions directly.

For example, it may be tempting to swap two values with tuple packing and unpacking instead of the traditional temporary variable. The timeit module quickly shows whether there is a performance advantage; the tuple version is usually modestly faster, although the exact timings depend on the machine:

>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
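By default timeit() runs the statement one million times. The module-level helpers timeit.timeit() and timeit.repeat() accept a number argument to shorten the run, and taking the minimum of several repeats reduces noise from other processes. This is a minimal sketch; the repeat and number values are arbitrary, and no timings are shown because they vary between machines.

```python
import timeit

setup = "a=1; b=2"

# Each call returns the total seconds taken for `number` executions.
swap_with_temp = timeit.timeit("t=a; a=b; b=t", setup=setup, number=100_000)
swap_with_tuple = timeit.timeit("a,b = b,a", setup=setup, number=100_000)

# repeat() returns a list of timings; the minimum is the least disturbed run.
best_tuple = min(timeit.repeat("a,b = b,a", setup=setup, repeat=3, number=100_000))

print(swap_with_temp, swap_with_tuple, best_tuple)
```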

In contrast to the fine granularity of timeit, the profile and pstats modules provide tools for identifying time-critical sections in larger blocks of code.
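A minimal profiling sketch follows. It uses cProfile, the C implementation that offers the same interface as profile, which is a substitution on our part rather than something named above. The function slow_sum is a made-up workload.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """A deliberately simple workload to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 50_000)   # profile a single call

# Send the report to a string buffer instead of standard output.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries
report = buffer.getvalue()

print(report)
```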

Test module

One approach to developing high-quality software is to write tests for each function as it is developed, and to run those tests frequently during the development process.

The doctest module provides a tool for scanning a module and validating the tests embedded in its docstrings.

Constructing a test is as simple as cutting and pasting a typical call, along with its result, into the docstring.

This improves the documentation by giving the user an example, and it allows the doctest module to confirm that the code remains true to the documentation:

def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print(average([20, 30, 70]))
    40.0
    """
    return sum(values) / len(values)

import doctest
doctest.testmod()   # Automatically validate the embedded tests

The unittest module is not as effortless to use as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file:

import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        self.assertRaises(ZeroDivisionError, average, [])
        self.assertRaises(TypeError, average, 20, 30, 70)

unittest.main() # Calling from the command line invokes all tests
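By default unittest.main() ends the process when the tests finish, which is inconvenient in an interactive session. The sketch below runs a test case programmatically instead and inspects the result object. The class name TestAverage is illustrative, and average is redefined here so the example is self-contained.

```python
import io
import unittest

def average(values):
    """Compute the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

class TestAverage(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)

    def test_empty_list(self):
        # assertRaises can also be used as a context manager.
        with self.assertRaises(ZeroDivisionError):
            average([])

# Load and run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAverage)
runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
result = runner.run(suite)

print(result.testsRun, result.wasSuccessful())
```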


Posted by mattsoftnet on Sun, 24 Oct 2021 11:27:01 -0700