1, Python 3 makes a clear distinction between text and binary data. Text is Unicode, represented by the str type and used for display. Binary data is represented by the bytes type and used for storage and transmission. A bytes object is a sequence of integers in the range 0 <= x < 256, and a str is a sequence of Unicode code points.
str type:
>>> s = u'Hello'
>>> s
'Hello'
>>> type(s)
<class 'str'>
bytes type:
>>> b = b'abc'
>>> b
b'abc'
>>> type(b)
<class 'bytes'>
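The difference in element type shows up when both types are measured and indexed. The following is a minimal sketch; the variable names are illustrative and do not come from the original text.

```python
# Illustrative names; not from the original text.
text = '你好'                  # str: a sequence of Unicode code points
data = text.encode('utf-8')   # bytes: a sequence of integers 0..255

text_len = len(text)          # counts code points
data_len = len(data)          # counts bytes after UTF-8 encoding

first_char = text[0]          # indexing a str yields a one-character str
first_byte = data[0]          # indexing bytes yields an int

print(text_len, data_len)             # 2 6
print(repr(first_char), first_byte)   # '你' 228
```

Each of the two characters takes three bytes in UTF-8, which is why the two lengths differ.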
2, Conversion between str and bytes: str --> encode() --> bytes --> decode() --> str
Conversion method 1: encode(), decode()
>>> a = u'你好'
>>> b = a.encode('utf-8')
>>> b
b'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> type(b)
<class 'bytes'>
>>> new_a = b.decode('utf-8')
>>> new_a
'你好'
>>> type(new_a)
<class 'str'>
Conversion method 2: bytes(), str()
>>> a = u'你好'
>>> b = bytes(a, encoding='utf-8')
>>> b
b'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> type(b)
<class 'bytes'>
>>> new_a = str(b, encoding='utf-8')
>>> new_a
'你好'
>>> type(new_a)
<class 'str'>
3, bytearray type
The bytearray class is a mutable sequence of integers in the range 0 <= x < 256.
The optional source parameter can be used to initialize the array in several different ways:
- If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().
- If it is an integer, the array will have that size and will be initialized with null bytes.
- If it is an object that conforms to the buffer interface, a read-only buffer of the object is used to initialize the byte array.
- If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.
- If there are no parameters, an array of size 0 is created.
When the source parameter is a string:
>>> b = bytearray(u'你好', encoding='utf-8')
>>> b
bytearray(b'\xe4\xbd\xa0\xe5\xa5\xbd')
>>> type(b)
<class 'bytearray'>
When the source parameter is an integer:
>>> b = bytearray(5)
>>> b
bytearray(b'\x00\x00\x00\x00\x00')
>>> type(b)
<class 'bytearray'>
When the source parameter is an iterable, every element must satisfy 0 <= x < 256:
>>> b = bytearray([1, 2, 3, 4, 255])
>>> b
bytearray(b'\x01\x02\x03\x04\xff')
>>> type(b)
<class 'bytearray'>
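The remaining initialization cases (no argument and a buffer-compatible object) and the two common failure modes can be sketched as follows. The variable names are illustrative and not from the original text.

```python
# Illustrative names; not from the original text.

# No argument: an empty bytearray.
empty = bytearray()

# Buffer-compatible objects: bytes and memoryview both qualify.
from_bytes = bytearray(b'abc')
from_view = bytearray(memoryview(b'abcdef')[2:4])

# An iterable containing a value outside 0 <= x < 256 is rejected.
try:
    bytearray([1, 2, 256])
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

# A string without an encoding is rejected as well.
try:
    bytearray('abc')
    missing_encoding_rejected = False
except TypeError:
    missing_encoding_rejected = True

print(empty)        # bytearray(b'')
print(from_bytes)   # bytearray(b'abc')
print(from_view)    # bytearray(b'cd')
print(out_of_range_rejected, missing_encoding_rejected)  # True True
```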
4, Difference between bytes and bytearray
bytes is immutable, like str. bytearray is mutable, like list.
>>> b = bytearray()
>>> b
bytearray(b'')
>>> b.append(10)
>>> b
bytearray(b'\n')
>>> b.append(100)
>>> b
bytearray(b'\nd')
>>> b.remove(100)
>>> b
bytearray(b'\n')
>>> b.insert(0, 150)
>>> b
bytearray(b'\x96\n')
>>> b.extend([1, 3, 5])
>>> b
bytearray(b'\x96\n\x01\x03\x05')
>>> b.pop(2)
1
>>> b
bytearray(b'\x96\n\x03\x05')
>>> b.reverse()
>>> b
bytearray(b'\x05\x03\n\x96')
>>> b.clear()
>>> b
bytearray(b'')
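For contrast, the same kind of in-place change fails on a bytes object. This is a minimal sketch with illustrative variable names that are not from the original text.

```python
# Illustrative names; not from the original text.
frozen = b'abc'

# Item assignment on bytes raises TypeError because bytes is immutable.
try:
    frozen[0] = 65
    bytes_assignment_failed = False
except TypeError:
    bytes_assignment_failed = True

buf = bytearray(b'abc')
buf[0] = 65          # replace 'a' (97) with 'A' (65) in place
buf[1:3] = b'XYZ'    # slice assignment may also change the length

print(bytes_assignment_failed)  # True
print(frozen)                   # b'abc'
print(buf)                      # bytearray(b'AXYZ')
```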
5, Conversion between bytes and bytearray
>>> b = b'abcdef'
>>> bay = bytearray(b)
>>> bay
bytearray(b'abcdef')
>>> b = bytes(bay)
>>> b
b'abcdef'
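Each of these conversions copies the data, so later changes to the bytearray do not affect the bytes objects on either side. A minimal sketch, with illustrative variable names that are not from the original text:

```python
# Illustrative names; not from the original text.
original = b'abcdef'

buf = bytearray(original)   # copies the data into a new mutable object
buf[0] = ord('X')           # modifies only the copy

snapshot = bytes(buf)       # copies again into a new immutable object
buf.append(ord('!'))        # does not affect the snapshot

# Equality compares contents, even across the two types.
same_content = bytes(buf) == buf

print(original)       # b'abcdef'
print(snapshot)       # b'Xbcdef'
print(buf)            # bytearray(b'Xbcdef!')
print(same_content)   # True
```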
6, Conversion between bytearray and str
>>> a = 'abcdef'
>>> b = bytearray(a, encoding='utf-8')
>>> b
bytearray(b'abcdef')
>>> a = b.decode(encoding='utf-8')
>>> a
'abcdef'
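One pitfall when converting back to str: calling str() on a bytearray without an encoding returns its printable representation rather than the decoded text. The sketch below uses illustrative variable names that are not from the original text.

```python
# Illustrative names; not from the original text.
buf = bytearray('abcdef', encoding='utf-8')

as_repr = str(buf)                       # no encoding: the repr, not the text
as_text = str(buf, encoding='utf-8')     # with encoding: the decoded text
also_text = buf.decode('utf-8')          # equivalent to the line above

print(as_repr)     # bytearray(b'abcdef')
print(as_text)     # abcdef
print(also_text)   # abcdef
```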