Python common modules (random, hashlib, os, sys)

Keywords: Python encoding SHA1 ascii

random

The random module generates pseudo-random numbers. The functions used most often are:

import random

# Return a float in [0.0, 1.0)
random.random()

# Return an int in [1, 3], both ends included
random.randint(1, 3)

# Return an int in [1, 3), i.e. 1 or 2; the stop value is excluded
random.randrange(1, 3)

# Pick one random element from a list
random.choice([3, 4, 5, 2, 1, 'kitty'])

# Pick two elements from the list without replacement, returned as a new list
random.sample([3, 4, 5, 2, 1, 'kitty'], 2)

# Return a float between 1 and 3
random.uniform(1, 3)

# Shuffle the elements of the list lst in place (the call itself returns None)
lst = [111, 222, 333, 444]
random.shuffle(lst)
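One way to check the ranges of these functions is to sample each one many times and look at the values that come back. This is only a sketch: the seed value and the sample count are arbitrary choices made here so that repeated runs behave the same.

```python
import random

random.seed(42)   # arbitrary seed, only so repeated runs give the same numbers

N = 10_000        # arbitrary sample count

floats = [random.random() for _ in range(N)]
ints_inclusive = {random.randint(1, 3) for _ in range(N)}
ints_half_open = {random.randrange(1, 3) for _ in range(N)}

print(min(floats) >= 0.0, max(floats) < 1.0)   # random() stays inside [0.0, 1.0)
print(sorted(ints_inclusive))                  # randint(1, 3) can return 3
print(sorted(ints_half_open))                  # randrange(1, 3) never returns 3

lst = [111, 222, 333, 444]
result = random.shuffle(lst)    # shuffles in place and returns None
print(result, sorted(lst))
```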

Example: generate a random 5-character verification code made up of digits and upper- and lower-case letters.

import random

def validate_code():
    res = ''
    for i in range(5):
        num = random.randint(0, 9)
        alpha_lower = chr(random.randint(97, 122))    # Lowercase letter a-z
        alpha_upper = chr(random.randint(65, 90))     # Uppercase letter A-Z
        s = random.choice([str(num), alpha_lower, alpha_upper])
        res += s

    return res

//Call result:
8Rj0x
306GX
...
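The same idea can be written more compactly by drawing every character from a single pool. The sketch below is an alternative, not the function above: the names `POOL` and `simple_code` and the `length` parameter are made up for illustration, and unlike the original it picks uniformly among all 62 characters instead of first picking a category (digit, lower case, upper case).

```python
import random
import string

# Pool of candidate characters: 0-9, a-z, A-Z (62 in total)
POOL = string.digits + string.ascii_lowercase + string.ascii_uppercase

def simple_code(length=5):
    """Return a random code of `length` characters drawn from POOL."""
    return ''.join(random.choice(POOL) for _ in range(length))

print(simple_code())
print(simple_code(8))
```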

hashlib

The hashlib module provides common digest (hash) algorithms, such as MD5 and SHA1.
A digest algorithm converts data of arbitrary length into a fixed-length value through a function; the value is usually written as a hexadecimal string.

import hashlib

md5_obj = hashlib.md5()
md5_obj.update(b"hello world")
print(md5_obj.hexdigest())           # 5eb63bbbe01eeed093cb22bb8f5acdc3

# Now change one letter: "hello world" -> "hello World".
# update() appends to the data already fed in, so start from a fresh object.
md5_obj = hashlib.md5()
md5_obj.update(b"hello World")
print(md5_obj.hexdigest())           # a completely different 32-character digest

The data is changed from 'hello world' to 'hello World', and the resulting digest is completely different. This is why a digest algorithm is generally used to extract a fingerprint of the data.

A digest function is one-way: computing the digest from the data is easy, but recovering the data from the digest is hard. A digest computed from the data can therefore be used to check whether the data has been tampered with.
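As a minimal sketch of such a tamper check: store the digest of the original data, then recompute it later and compare. The helper name `md5_hex` and the sample messages are invented for this illustration.

```python
import hashlib

def md5_hex(data):
    """Return the MD5 digest of a bytes object as a hex string."""
    return hashlib.md5(data).hexdigest()

original = b"transfer 100 to alice"
stored_digest = md5_hex(original)          # saved alongside the data

received_ok = b"transfer 100 to alice"
received_bad = b"transfer 900 to alice"    # one character changed

print(md5_hex(received_ok) == stored_digest)    # True: data unchanged
print(md5_hex(received_bad) == stored_digest)   # False: data was modified
```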

To get the digest of a large file, call update() once for each line read and compute the digest at the end. For example:

import hashlib

md5_obj = hashlib.md5()
with open(file='/Users/luyi/tmp/passwd', mode='r', encoding='utf-8') as f:
    for line in f:
        md5_obj.update(line.encode('utf-8'))

print(md5_obj.hexdigest())

Tip: In Python 3, the argument passed to update() must be bytes. A Python 3 str holds Unicode text, so it has to be encoded to bytes (for example with encode('utf-8')) before it can be hashed.
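For files that are not plain text, or that have no line breaks, a common variant is to open the file in binary mode and feed fixed-size chunks to update(). The following is a sketch under assumptions: the function name `file_md5`, the 64 KiB chunk size and the temporary demo file are all choices made here, not part of the example above.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=65536):
    """Hash a file chunk by chunk so it never has to fit in memory."""
    md5_obj = hashlib.md5()
    with open(path, 'rb') as f:          # binary mode: read() returns bytes
        while True:
            chunk = f.read(chunk_size)
            if not chunk:                # empty bytes means end of file
                break
            md5_obj.update(chunk)
    return md5_obj.hexdigest()

# Demo on a throwaway file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    demo_path = tmp.name

demo_digest = file_md5(demo_path)
print(demo_digest)    # same digest as hashing b"hello world" in one call
os.remove(demo_path)
```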

 
The examples above all use MD5. MD5 is a common digest algorithm that produces a fixed 128-bit digest, usually written as a 32-character hexadecimal string. Another digest algorithm, SHA1, is called in the same way as MD5. SHA1 produces a 160-bit digest, usually written as a 40-character hexadecimal string.

import hashlib

sha1_obj = hashlib.sha1()
sha1_obj.update(b'hello world')
sha1_obj.update(b'hello kitty')
print(sha1_obj.hexdigest())      # 563258876190465d493543b96306a92164ac7e62

Besides MD5 and SHA1 there are also SHA256 and SHA512. These two produce longer digests and are more secure, but are slower to compute.
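The digest lengths of these algorithms can be read off directly. The short sketch below uses hashlib.new() to build each algorithm by name; the variable names and the sample data are arbitrary.

```python
import hashlib

data = b"hello world"
sizes = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, data)
    # digest_size is in bytes; the hex string uses two characters per byte
    sizes[name] = (h.digest_size * 8, len(h.hexdigest()))
    print(name, sizes[name])
```

Each pair printed is (digest length in bits, length of the hexadecimal string).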

 
The input data can be of any length, but the digest has a fixed length. It is therefore possible for two different pieces of data to have the same digest, which is called a collision, although the probability is low.

Digest algorithms are also commonly used to store passwords: the password is hashed one-way before it is stored in the database. To verify a password, the password entered by the user is hashed the same way and compared with the digest stored in the database.

But this raises another problem. Many users choose passwords that are too simple, such as '123456', 'admin' or 'password'. If the table holding the user information is leaked, an attacker can compute the MD5 values of these simple passwords in advance and compare them with the digests in the table, recovering the passwords of some users.

e10adc3949ba59abbe56e057f20f883e  123456
21232f297a57a5a743894a0e4a801fc3   admin
5f4dcc3b5aa765d61d8327deb882cf99  password
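The precomputation described above amounts to a dictionary from digest to password. A minimal sketch, in which the list of passwords is the three from the text and `leaked_digest` stands for a hypothetical row from a leaked table:

```python
import hashlib

common_passwords = ["123456", "admin", "password"]

# Precomputed table: digest -> plaintext password
lookup = {hashlib.md5(p.encode("utf-8")).hexdigest(): p for p in common_passwords}

# A digest as it might appear in a leaked table (hypothetical row)
leaked_digest = "e10adc3949ba59abbe56e057f20f883e"
print(lookup.get(leaked_digest))    # 123456
```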

 
The solution is to "salt" the original password: a string is added to the password before hashing, for example the user name. Even if two users choose the same password, the digests computed after adding the user name will differ.

import hashlib

md5_obj = hashlib.md5(b'kitty')     # Salt is added here.
md5_obj.update(b"123456")
print(md5_obj.hexdigest())
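To see the effect of the salt, the same password can be hashed for two different users. This is an illustrative sketch only: the function name `salted_md5` and the second user name 'tom' are made up here.

```python
import hashlib

def salted_md5(username, password):
    """Hash the password with the user name as salt (illustration only)."""
    md5_obj = hashlib.md5(username.encode("utf-8"))   # the salt goes in first
    md5_obj.update(password.encode("utf-8"))
    return md5_obj.hexdigest()

digest_kitty = salted_md5("kitty", "123456")
digest_tom = salted_md5("tom", "123456")

print(digest_kitty)
print(digest_tom)
print(digest_kitty == digest_tom)   # False: same password, different digests
```

Neither digest matches the unsalted MD5 of '123456', so a table precomputed from plain passwords no longer helps.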

os

The os module is an interface for interacting with the operating system.
Common functions and attributes are as follows:

os.getcwd()  # Get the current working directory, i.e. the directory the current Python script is working in
os.chdir("dirname")  # Change the working directory of the current script; equivalent to cd in a shell
os.curdir  # The string that refers to the current directory: '.'
os.pardir  # The string that refers to the parent directory: '..'

os.makedirs('dirname1/dirname2')    # Create nested directories recursively
os.removedirs('dirname1')    # Remove the directory if it is empty, then try to remove each parent directory named in the path in turn, stopping at the first one that is not empty
os.mkdir('dirname')    # Create a single directory; equivalent to mkdir dirname in a shell
os.rmdir('dirname')    # Remove a single empty directory; raises an error if the directory is not empty; equivalent to rmdir dirname in a shell
os.listdir('dirname')    # List all files and subdirectories in the given directory, including hidden files, and return them as a list
os.remove('path/filename')  # Delete a file
os.rename("oldname","newname")  # Rename a file or directory. Note: on Windows the rename can fail while the file is open

os.stat('path/filename')  # Get information about a file or directory
os.sep    # Path separator used by the operating system: "\" on Windows, "/" on Linux
os.linesep    # Line terminator used by the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep    # Separator between entries in a list of paths such as PATH: ";" on Windows, ":" on Linux
os.name    # Name of the current platform family: 'nt' on Windows, 'posix' on Linux
os.system("bash command")  # Run a shell command; its output goes straight to the terminal
os.environ  # The system environment variables, as a mapping

os.path.abspath(path)  # Return the normalized absolute version of path
os.path.split(path)  # Split path into a (directory, file name) tuple
os.path.dirname(path)  # Return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  # Return the final file name of path, i.e. the second element of os.path.split(path). If path ends with / or \, an empty string is returned
os.path.exists(path)  # Return True if path exists, otherwise False
os.path.isabs(path)  # Return True if path is an absolute path
os.path.isfile(path)  # Return True if path is an existing file, otherwise False
os.path.isdir(path)  # Return True if path is an existing directory, otherwise False
os.path.join(path1[, path2[, ...]])  # Join several path components and return the result; components before the last absolute path are discarded
os.path.getatime(path)  # Return the last access time of the file or directory at path, in seconds since the epoch
os.path.getmtime(path)  # Return the last modification time of the file or directory at path, in seconds since the epoch
os.path.getsize(path)   # Return the size of path in bytes
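Several of the directory functions above can be tried safely inside a scratch directory. The sketch below assumes nothing beyond the standard library; the file names a.txt and b.txt are placeholders, and the scratch directory comes from tempfile.mkdtemp() so nothing else on disk is touched.

```python
import os
import tempfile

start_dir = os.getcwd()
base = tempfile.mkdtemp()            # scratch directory for the demo
os.chdir(base)                       # like `cd` in a shell

os.makedirs('dirname1/dirname2')     # creates both levels at once
with open('dirname1/dirname2/a.txt', 'w') as f:
    f.write('hello')

listing = os.listdir('dirname1/dirname2')
print(listing)                       # ['a.txt']

os.rename('dirname1/dirname2/a.txt', 'dirname1/dirname2/b.txt')
renamed = os.listdir('dirname1/dirname2')
print(renamed)                       # ['b.txt']

os.remove('dirname1/dirname2/b.txt')
os.removedirs('dirname1/dirname2')   # removes dirname2, then the now-empty dirname1
left_over = os.listdir('.')
print(left_over)                     # []

os.chdir(start_dir)                  # go back before deleting the scratch directory
os.rmdir(base)
```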

The following are often used:

os.remove()
os.path.abspath(path)
    os.path.abspath(__file__)   # Get the path of the current execution script
os.path.dirname(path) 
os.path.basename(path) 
os.path.exists(path) 
os.path.isfile(path)
os.path.isdir(path)
os.path.join(path1[, path2[, ...]]) 
    os.path.join('/','etc', 'passwd')   # /etc/passwd
os.path.getsize(path)
    os.path.getsize('/etc/passwd')   # e.g. 6804; the size is in bytes
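The relationships between these os.path functions can be checked on a sample path. This is a small sketch; the path 'home/luyi/tmp/passwd.txt' is hypothetical and does not need to exist, since split, dirname, basename and join only manipulate strings.

```python
import os

path = os.path.join('home', 'luyi', 'tmp', 'passwd.txt')   # hypothetical relative path

directory, filename = os.path.split(path)
print(directory == os.path.dirname(path))    # True
print(filename == os.path.basename(path))    # True
print(filename)                              # passwd.txt

# A later absolute component discards everything before it
joined = os.path.join('ignored', os.sep + 'etc', 'passwd')
print(joined)                                # /etc/passwd on Linux and macOS

# A trailing separator leaves an empty basename
empty_name = os.path.basename('dir' + os.sep)
print(repr(empty_name))                      # ''

print(os.path.isabs(path))                   # False: the path is relative
print(os.path.isabs(os.path.abspath(path)))  # True
print(os.path.isdir(os.getcwd()))            # True
```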

sys

The os module interacts with the operating system, while the sys module interacts with the Python interpreter.

The common attributes and functions of sys are as follows:

sys.argv           # Command-line arguments as a list; the first element is the path of the script itself
sys.exit(n)        # Exit the program with status code n; exit(0) means a normal exit
sys.version        # Version information of the Python interpreter
sys.maxint         # Largest int value (Python 2 only; Python 3 ints are unbounded and sys.maxsize is the closest counterpart)
sys.path           # The module search path, a list initialized from the PYTHONPATH environment variable among other locations
sys.platform       # Name of the operating system platform
sys.getdefaultencoding()    # Get the current default string encoding: ascii in Python 2, utf-8 in Python 3
sys.setdefaultencoding()     # Python 2 only: set the default encoding. dir(sys) does not show this function and calling it fails until reload(sys) has been executed. Python 3 has neither this function nor the built-in reload()

sys.getfilesystemencoding()  # Get the encoding used for file names: 'mbcs' on Windows (before Python 3.6) and 'utf-8' on macOS

sys.stdin, sys.stdout, sys.stderr  # Stream objects for standard input, standard output and standard error. When print does not give enough control over the output, these objects can be used directly. They can also be redirected to other devices or handled in non-standard ways
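Redirecting output can be sketched by temporarily pointing sys.stdout at another file-like object. Here an in-memory io.StringIO buffer is used as the target; that choice, and the variable names, are only for illustration.

```python
import io
import sys

buffer = io.StringIO()
original_stdout = sys.stdout

sys.stdout = buffer                  # from here on, print() writes into the buffer
try:
    print('captured line')
    sys.stdout.write('written directly\n')
finally:
    sys.stdout = original_stdout     # always restore the real stream

captured = buffer.getvalue()
sys.stderr.write('this goes to stderr, not stdout\n')
print(repr(captured))
```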

sys.argv

When a Python program is run from the command line, the script path and the command-line arguments are stored as a list in sys.argv.

The file sys_test.py reads as follows:
import sys
print(sys.argv)

//Command line execution:
➜  ~ python ~/tmp/sys_test.py 1 2 3 4 5 6
['/Users/luyi/tmp/sys_test.py', '1', '2', '3', '4', '5', '6']
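Note that every element of sys.argv is a string, so numeric arguments have to be converted before use. A small sketch, where the function name `summarize` is invented and a hand-written list stands in for the real sys.argv so the example can run anywhere:

```python
import sys

def summarize(argv):
    """Split an argv-style list into the script path and the sum of its numeric arguments."""
    script, args = argv[0], argv[1:]
    numbers = [int(a) for a in args]     # every element of argv is a str
    return script, sum(numbers)

# Simulated command line: python sys_test.py 1 2 3 4 5 6
script, total = summarize(['sys_test.py', '1', '2', '3', '4', '5', '6'])
print(script, total)    # sys_test.py 21

# In a real script, pass sys.argv instead:
# script, total = summarize(sys.argv)
```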

sys.exit(n)

The Python interpreter exits automatically when the program finishes. To exit earlier, call sys.exit(n). The argument n sets the exit status code: by convention n=0 means a normal exit and other values (1-127) mean an abnormal exit.

Example:

# Run the following .py file
import sys
sys.exit(2)

➜  tmp python sys_test.py 
➜  tmp echo $?
2                                # The status return code is 2

An early exit can also be intercepted by catching SystemExit, so that any necessary work can be done in the except block before exiting:

import sys

print('start...')
try:
    sys.exit(1)
except SystemExit:
    print('end...')
    sys.exit(0)

print('continue')    # never reached: sys.exit(0) above ends the program

# Output results:
start...
end...
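The status passed to sys.exit() travels with the exception and can be read from its code attribute. A minimal sketch, with a hypothetical helper named `run`:

```python
import sys

def run(n):
    """Call sys.exit(n) and return the status carried by SystemExit."""
    try:
        sys.exit(n)
    except SystemExit as exc:
        # exc.code holds the argument that was passed to sys.exit()
        return exc.code

print(run(0))   # 0
print(run(2))   # 2
print(issubclass(SystemExit, Exception))   # False: `except Exception` will not catch it
```

Because SystemExit derives from BaseException rather than Exception, it has to be named explicitly in the except clause, as in the example above.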

sys.path

sys.path is the list of directories searched for modules. If a module you need is not in any of these directories, append its directory to this list and the import statement will then find it.

>>> import sys
>>> sys.path
['', '/Library/Frameworks/Python.framework/Versions/3.6/lib/python36.zip', '/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6', '/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload', '/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages']
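Adding a directory to sys.path can be sketched as follows. The module name `demo_helper_mod` and its contents are made up, and the module is written to a temporary directory so the example is self-contained.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Create a throwaway directory containing a tiny module
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'demo_helper_mod.py'), 'w') as f:
    f.write("GREETING = 'hello from demo_helper_mod'\n")

sys.path.append(module_dir)          # make the directory importable
importlib.invalidate_caches()        # the file was created after startup

import demo_helper_mod
print(demo_helper_mod.GREETING)

# Clean up: restore sys.path and delete the scratch directory
sys.path.remove(module_dir)
shutil.rmtree(module_dir)
```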

sys.getdefaultencoding(),sys.setdefaultencoding()

When a Python program converts between Unicode text and encoded byte strings (in Python 2, between the unicode type and the str type; in Python 3, between the str type and the bytes type; for details see http://blog.51cto.com/ljbaby/2164480 ), the encoding returned by getdefaultencoding() is used for the conversion when no encoding is specified.

The default encoding is ascii in Python 2 and utf-8 in Python 3.

# python2
import sys
print sys.getdefaultencoding()

//Output results:
ascii

# python3
import sys
print(sys.getdefaultencoding())

//Output results:
utf-8

In Python 3, concatenating a str (all Python 3 strings are Unicode text) with a bytes object raises an error directly:

x = 'Hello,'                        # str type
y = 'Bei Bei'.encode('utf-8')        # bytes type
print(x + y)

//Error message (wording from Python 3.6; newer versions phrase it differently):
TypeError: must be str, not bytes
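In Python 3 the fix is to convert explicitly so that both operands have the same type. A short sketch of both directions, with variable names chosen here for illustration:

```python
x = 'Hello,'                          # str
y = 'Bei Bei'.encode('utf-8')         # bytes

as_text = x + y.decode('utf-8')       # decode bytes -> str, then concatenate
as_bytes = x.encode('utf-8') + y      # or encode str -> bytes, then concatenate

print(as_text)     # Hello,Bei Bei
print(as_bytes)    # b'Hello,Bei Bei'

try:
    x + y
except TypeError as exc:
    error_type = type(exc).__name__   # the message text varies between versions
    print(error_type)
```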

But in Python 2 this operation is allowed. The interpreter automatically converts the str into unicode and then concatenates; the result is also of type unicode. Because no encoding is specified for this automatic conversion, the interpreter uses the one returned by sys.getdefaultencoding(). That default is ascii, so non-ASCII bytes in the str cause the following error:

# -*- coding: utf-8 -*-
x = u'Hello,'
y = '贝贝'          # a str holding non-ASCII UTF-8 bytes (the first byte is 0xe8)
print x + y

//Error message:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)

Setting the default encoding to utf-8 makes the concatenation work:

# -*- coding: utf-8 -*-

import sys
reload(sys)
sys.setdefaultencoding('utf-8')
x = u'Hello,'
y = '贝贝'          # a str holding non-ASCII UTF-8 bytes
print x + y

//Output results:
Hello,贝贝

Tip: This trick is common in Python 2. It does not exist in Python 3: there is no built-in reload() and sys has no setdefaultencoding() function. The default encoding in Python 3 is utf-8 and does not need to be changed.


Posted by TheOracle on Sun, 03 Feb 2019 11:03:15 -0800