Python str type, bytes type, and bytearray type

Keywords: Python encoding

1. Python 3 distinguishes between text and binary data. Text is represented by the str type, a sequence of Unicode code points, and is used for display. Binary data is represented by the bytes type, a sequence of bytes (integers in the range 0 <= x < 256), and is used for storage and transmission.

str type:

>>> s = u'Hello'
>>> s
'Hello'
>>> type(s)
<class 'str'>

bytes type:

>>> b = b'abc'
>>> b
b'abc'
>>> type(b)
<class 'bytes'>
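The practical difference between the two types shows up as soon as you index or combine them. The following is a minimal sketch (variable names are illustrative only): indexing a str yields a one-character str, indexing bytes yields an int, and the two types cannot be concatenated directly.

```python
text = "abc"
data = b"abc"

# Indexing a str gives a one-character str; indexing bytes gives an int.
first_char = text[0]   # 'a'
first_byte = data[0]   # 97

# Slicing keeps the original type in both cases.
text_slice = text[:2]  # 'ab'
data_slice = data[:2]  # b'ab'

# Mixing the two types raises TypeError instead of converting silently.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(first_char, first_byte, text_slice, data_slice, mixed_ok)
```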

2. Converting between str and bytes: str --> encode() --> bytes --> decode() --> str

Conversion method 1: encode(), decode()

>>> a = u'你好'
>>> b = a.encode('utf-8')
>>> b
b'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> type(b)
<class 'bytes'>
>>> new_a = b.decode('utf-8')
>>> new_a
'你好'
>>> type(new_a)
<class 'str'>

Conversion method 2: bytes(), str()

>>> a = u'你好'
>>> b = bytes(a, encoding='utf-8')
>>> b
b'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> type(b)
<class 'bytes'>
>>> new_a = str(b, encoding='utf-8')
>>> new_a
'你好'
>>> type(new_a)
<class 'str'>
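Both conversion methods depend on the encoding you name, and both accept an errors argument. The sketch below is illustrative only (the sample string and variable names are my own choice): it shows that the same text produces different byte lengths under different encodings, and what happens when the bytes are decoded with a codec that cannot represent them.

```python
text = "你好"

# The same two characters occupy a different number of bytes per encoding.
utf8_bytes = text.encode("utf-8")       # 3 bytes per character here
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character here

# Decoding with a codec that does not match the data raises an error by default.
try:
    utf8_bytes.decode("ascii")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# The errors argument changes that behaviour.
replaced = utf8_bytes.decode("ascii", errors="replace")  # U+FFFD per bad byte
ignored = text.encode("ascii", errors="ignore")          # unencodable chars dropped

print(len(utf8_bytes), len(utf16_bytes), decoded_ok, len(replaced), ignored)
```

In practice, 'replace' and 'ignore' lose information, so they are best kept for display or logging rather than for data that must round-trip.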

3. The bytearray type

The bytearray class is a mutable sequence of integers in the range 0 <= x < 256.

The optional source parameter can be used to initialize the array in several different ways:

  • If it is a string, you must also give the encoding (and optionally errors) parameters; bytearray() then converts the string to bytes using str.encode().
  • If it is an integer, the array will have that size and will be initialized with null bytes.
  • If it is an object that supports the buffer interface, a read-only buffer of the object is used to initialize the byte array.
  • If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.
  • If no argument is given, an array of size 0 is created.

When the source parameter is a string:

>>> b = bytearray(u'你好', encoding='utf-8')
>>> b
bytearray(b'\xe4\xbd\xa0\xe5\xa5\xbd')
>>> type(b)
<class 'bytearray'>

When the source parameter is an integer:

>>> b = bytearray(5)
>>> b
bytearray(b'\x00\x00\x00\x00\x00')
>>> type(b)
<class 'bytearray'>

When the source parameter is an iterable, every element must satisfy 0 <= x < 256:

>>> b = bytearray([1, 2, 3, 4, 255])
>>> b
bytearray(b'\x01\x02\x03\x04\xff')
>>> type(b)
<class 'bytearray'>
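The remaining cases from the list of source types can be sketched the same way. This is an illustrative example, not an exhaustive one: it builds a bytearray from objects that support the buffer interface (bytes, memoryview, and array.array are used here as examples), from no argument at all, and shows the errors raised for an out-of-range integer and for a string passed without an encoding.

```python
from array import array

# Objects that support the buffer interface are copied byte for byte.
from_bytes = bytearray(b"abc")
from_view = bytearray(memoryview(b"abcdef")[2:4])
from_array = bytearray(array("B", [1, 2, 3]))

# No argument: an empty array.
empty = bytearray()

# An iterable containing a value outside 0 <= x < 256 is rejected.
try:
    bytearray([1, 2, 256])
    range_error = None
except ValueError as exc:
    range_error = exc

# A string without an encoding is rejected as well.
try:
    bytearray("abc")
    encoding_error = None
except TypeError as exc:
    encoding_error = exc

print(from_bytes, from_view, from_array, empty)
print(type(range_error).__name__, type(encoding_error).__name__)
```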

4. Difference between bytes and bytearray

bytes is immutable, like str. bytearray is mutable, like list.

>>> b = bytearray()
>>> b
bytearray(b'')
>>> b.append(10)
>>> b
bytearray(b'\n')
>>> b.append(100)
>>> b
bytearray(b'\nd')
>>> b.remove(100)
>>> b
bytearray(b'\n')
>>> b.insert(0, 150)
>>> b
bytearray(b'\x96\n')
>>> b.extend([1, 3, 5])
>>> b
bytearray(b'\x96\n\x01\x03\x05')
>>> b.pop(2)
1
>>> b
bytearray(b'\x96\n\x03\x05')
>>> b.reverse()
>>> b
bytearray(b'\x05\x03\n\x96')
>>> b.clear()
>>> b
bytearray(b'')
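For contrast, here is a short sketch of what immutability means for bytes (names are illustrative). Item assignment on bytes raises TypeError, while bytearray accepts both item and slice assignment. A side effect of the same distinction is that bytes is hashable and can be used as a dict key, whereas bytearray cannot.

```python
frozen = b"abc"
try:
    frozen[0] = 65
    assignment_failed = False
except TypeError:
    assignment_failed = True

buf = bytearray(b"abc")
buf[0] = 65        # item assignment takes an int
buf[1:3] = b"XY"   # slice assignment takes a bytes-like object

# bytes is hashable, so it works as a dict key; bytearray is not.
lookup = {frozen: "ok"}
try:
    hash(buf)
    bytearray_hashable = True
except TypeError:
    bytearray_hashable = False

print(assignment_failed, buf, lookup[b"abc"], bytearray_hashable)
```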

5. Converting between bytes and bytearray

>>> b = b'abcdef'
>>> bay = bytearray(b)
>>> bay
bytearray(b'abcdef')
>>> b = bytes(bay)
>>> b
b'abcdef'
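One point worth noting about these conversions is that each one copies the data. The sketch below (illustrative names) modifies a bytearray after converting in both directions and checks that neither the original bytes nor the later bytes snapshot is affected.

```python
original = b"abcdef"

# bytearray(original) copies the data, so edits do not touch the source bytes.
buf = bytearray(original)
buf[0] = ord("X")

# bytes(buf) takes an immutable snapshot of the current contents.
snapshot = bytes(buf)
buf[1] = ord("Y")

# The two types compare equal when their contents match.
same_content = bytes(buf) == buf

print(original, snapshot, buf, same_content)
```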

6. Converting between bytearray and str

>>> a = 'abcdef'
>>> b = bytearray(a, encoding='utf-8')
>>> b
bytearray(b'abcdef')
>>> a = b.decode(encoding='utf-8')
>>> a
'abcdef'
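A common pitfall when going from bytearray (or bytes) back to str is calling str() without an encoding. As a small illustrative sketch: without the encoding argument, str() returns the printable representation of the object rather than its decoded text.

```python
buf = bytearray("abcdef", encoding="utf-8")

# With an encoding, str() decodes the bytes, just like buf.decode().
decoded = str(buf, encoding="utf-8")

# Without an encoding, str() returns the repr of the object instead.
not_decoded = str(buf)

print(decoded)
print(not_decoded)
```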

Posted by expostfacto on Thu, 14 May 2020 09:48:49 -0700