Python Full Stack Path series: the string data type

Keywords: Programming Python Linux encoding Unix

String (str)

A string (str) is one of Python's sequence types: in essence, a sequence of characters. Strings are also immutable. You cannot modify the original string in place, but you can copy parts of it into a new string to achieve the same effect.
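
The immutability described above can be checked directly. The following is a minimal sketch; the variable names are only illustrative:

```python
# Strings are immutable: assigning to an index raises TypeError.
name = "ansheng"
try:
    name[0] = "A"
    mutated = True
except TypeError:
    mutated = False

# To "modify" a string, build a new one from pieces of the old one.
new_name = "A" + name[1:]
print(mutated, name, new_name)
```

The original string is left untouched; only the new name refers to the changed text.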

To create a string, you can use single quotation marks, double quotation marks, or triple quotation marks. For example:

Single quotation mark

>>> string = 'ansheng'
# type() shows the data type of a variable
>>> type(string)
<class 'str'>

Double quotation marks

>>> string = "ansheng"
# type() shows the data type of a variable
>>> type(string)
<class 'str'>

Triple quotation marks

>>> string = """ansheng"""
>>> type(string)
<class 'str'>

You can also create a string by calling str() explicitly

>>> var=str("string")
>>> var
'string'
>>> type(var)
<class 'str'>

String methods

Every class has many methods, but only a handful are used regularly, whether you are learning or working, so there is no need to memorize them all. For the rest, a general impression is enough; if you forget one, you can look it up.

Capitalize the first character (the remaining characters are lowercased)

capitalize(self):

>>> name="ansheng"
>>> name.capitalize()
'Ansheng'

Center the content. width: the total width of the resulting string; fillchar: the fill character, which defaults to a space.

center(self, width, fillchar=None):

# Define a string variable named "string" with the content "hello word"
>>> string="hello word"
# len() returns the length of the string
>>> len(string)
10
# The width equals the string length, so nothing is filled
>>> string.center(10,"*")
'hello word'
# With a width of 11 there is one position left over, which is filled with "*"
>>> string.center(11,"*")
'*hello word'
# With a width of 12 one "*" is added on each side
>>> string.center(12,"*")
'*hello word*'

Count the number of times a substring appears in the string. The optional parameters start and end limit the search to a range of the string.

count(self, sub, start=None, end=None):

Parameter	Description
sub	The substring to search for
start	Index where the search starts. Defaults to 0, the index of the first character
end	Index where the search ends (exclusive). Defaults to the end of the string

>>> string="hello word"
# By default the whole string is searched; "l" appears twice
>>> string.count("l")
2
# Searching from index 3 up to (but not including) index 6, "l" appears once
>>> string.count("l",3,6)
1

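Decode (Python 2 only; in Python 3, decode() is a method of bytes, not of str)

decode(self, encoding=None, errors=None):

# Define a variable whose content is Chinese text (a UTF-8 byte string in Python 2)
temp = "中文"
# Decode the UTF-8 bytes into a unicode object
temp_unicode = temp.decode("utf-8")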

Encode: converts unicode text into bytes in the given encoding

encode(self, encoding=None, errors=None):

# Define a unicode variable whose content is Chinese text
temp = u"中文"
# Encode it; you need to specify the target encoding
temp_gbk = temp.encode("gbk")
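
The two snippets above are written for Python 2. As a rough Python 3 equivalent, str.encode() produces bytes and bytes.decode() turns them back into str. The sample text and variable names below are assumptions chosen for illustration:

```python
text = "中文"                             # str: a sequence of Unicode characters
utf8_bytes = text.encode("utf-8")         # str -> bytes, three bytes per character here
gbk_bytes = text.encode("gbk")            # same text in a different encoding
round_trip = utf8_bytes.decode("utf-8")   # bytes -> str
print(utf8_bytes, gbk_bytes, round_trip)
```

Decoding must use the same encoding that produced the bytes; otherwise the result is garbled text or a UnicodeDecodeError.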

Returns True if the string ends with the specified suffix, otherwise returns False.

endswith(self, suffix, start=None, end=None):

Parameter	Description
suffix	The suffix to test for: a string, or a tuple of suffixes to try
start	Start index of the slice to check
end	End index of the slice to check (exclusive)

>>> string="hello word"
# Check whether the string ends with "d"; returns True if so
>>> string.endswith("d")
True
# The string does not end with "t", so False is returned
>>> string.endswith("t")
False
# Restrict the check to the slice string[1:7] ("ello w"); it ends with "w", not "d", so False is returned
>>> string.endswith("d",1,7)
False
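
The parameter table mentions that suffix may also be a tuple. A small sketch of that form, using made-up file names:

```python
filenames = ["notes.txt", "photo.png", "script.py", "archive.tar.gz"]
# A tuple of suffixes matches if the string ends with any one of them.
text_like = [f for f in filenames if f.endswith((".txt", ".py"))]
print(text_like)
```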

Replace each tab character ('\t') in the string with spaces. The default tab size is 8 columns.

expandtabs(self, tabsize=None):

Parameter	Description
tabsize	Number of columns between tab stops; each tab ('\t') is replaced by enough spaces to reach the next stop

>>> string="hello\tword"
# Echoing the variable shows the "\t" in the middle, which is a tab character
>>> string
'hello\tword'
# With a tab size of 1 the tab becomes a single space
>>> string.expandtabs(1)
'hello word'
# With a tab size of 10 the tab is padded to column 10, which is five spaces after "hello"
>>> string.expandtabs(10)
'hello     word'

Check whether the string contains the substring sub. If start and end are specified, only that range is searched. If the substring is found, its lowest starting index is returned; otherwise -1 is returned.

find(self, sub, start=None, end=None):

Parameter	Description
sub	The substring to search for
start	Start index, default is 0
end	End index, defaults to the length of the string

>>> string="hello word"
# Returns the index of the first "o"; the search stops at the first match
>>> string.find("o")
4
# Start searching at index 5 and return the index of the next "o"
>>> string.find("o",5)
7

String formatting; it will be covered in a later article.

format(*args, **kwargs):
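
Until then, a brief sketch of the most common forms may help. The sample values are made up for illustration:

```python
# Positional and keyword replacement fields
greeting = "{} is {} years old".format("ansheng", 18)
labelled = "{name} uses {lang}".format(name="ansheng", lang="Python")
# Format specs: centre in 12 columns filled with "*", and two decimal places
padded = "{:*^12}".format("hello word")
price = "{:.2f}".format(3.14159)
print(greeting, labelled, padded, price)
```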

Check whether the string contains the substring sub. If start and end are specified, only that range is searched. This method behaves like find(), except that a ValueError is raised if sub is not in the string.

index(self, sub, start=None, end=None):

Parameter	Description
sub	The substring to search for
start	Start index, default is 0
end	End index, defaults to the length of the string

>>> string="hello word"
# Returns the index of the first occurrence
>>> string.index("o")
4
# Looking up a substring that does not exist raises an error
>>> string.index("a")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
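
Because index() raises an exception while find() returns -1, the choice between them affects how a missing substring is handled. The helper below is a hypothetical illustration, not part of the str API:

```python
def position_or_none(text, sub):
    """Return the index of sub in text, or None when sub is absent."""
    try:
        return text.index(sub)
    except ValueError:
        return None

string = "hello word"
found = position_or_none(string, "o")
missing = position_or_none(string, "a")
via_find = string.find("a")   # find() signals "not found" with -1 instead
print(found, missing, via_find)
```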

Checks whether a string consists only of letters and numbers. Returns True if the string has at least one character and all characters are letters or numbers; otherwise returns False.

isalnum(self):

>>> string="hes2323"
# Returns True if every character is a letter or a digit, otherwise False
>>> string.isalnum()
True
# A space in the middle makes it return False
>>> string="hello word"
>>> string.isalnum()
False

Checks whether a string is made up of only letters.

isalpha(self):

# Returns True if all characters are letters
>>> string="helloword"
>>> string.isalpha()
True
# Otherwise, return False
>>> string="hes2323"
>>> string.isalpha()
False

Checks whether the string is made up of digits only

isdigit(self):

# Returns True if every character is a digit, otherwise False
>>> string="hes2323"
>>> string.isdigit()
False
>>> string="2323"
>>> string.isdigit()
True

Checks whether all the letters in the string are lowercase

islower(self):

# Returns True if the letters in the string are all lowercase, otherwise False
>>> string="hesasdasd"
>>> string.islower()
True
>>> string="HelloWord"
>>> string.islower()
False

Checks whether a string is made up of whitespace only

isspace(self):

# Returns True if the string consists only of whitespace, otherwise False
>>> string=" "
>>> string.isspace()
True
>>> string="a"
>>> string.isspace()
False

Checks whether every word in the string starts with an uppercase letter and all its other letters are lowercase.

istitle(self):

# Returns True if each word starts with an uppercase letter and its other letters are lowercase, otherwise False
>>> string="Hello Word"
>>> string.istitle()
True
>>> string="Hello word"
>>> string.istitle()
False

Checks whether all letters in the string are uppercase.

isupper(self):

# Returns True if all the letters in the string are uppercase, otherwise False
>>> string="hello word"
>>> string.isupper()
False
>>> string="HELLO WORD"
>>> string.isupper()
True

Generates a new string by joining the elements of an iterable, using the string the method is called on as the separator.

join(self, iterable):

>>> string=("a","b","c")
>>> '-'.join(string)
'a-b-c'
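
join() only accepts strings as elements, so other values have to be converted first. A short sketch with made-up data:

```python
numbers = [1, 2, 3]
# join() raises TypeError for non-string items, so convert each one with str()
joined = ",".join(str(n) for n in numbers)
# Any iterable of strings works, for example a list
path = "/".join(["usr", "local", "bin"])
print(joined, path)
```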

Returns a new string with the original string left-justified and padded with the fill character (a space by default) to the specified width. Returns the original string if the specified width is less than the length of the original string.

ljust(self, width, fillchar=None):

Parameter	Description
width	Total width of the resulting string
fillchar	Fill character, defaults to a space

>>> string="helo word"
>>> len(string)
9
# The difference between the requested width and the string length is filled with the fill character
>>> string.ljust(15,'*')
'helo word******'

Converts all uppercase characters in a string to lowercase.

lower(self):

# Convert all uppercase letters to lowercase
>>> string="Hello WORD"
>>> string.lower()
'hello word'

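Removes whitespace or the specified characters from the left side of a string

lstrip(self, chars=None):

Parameter	Description
chars	The set of characters to remove; defaults to whitespace

# Leading characters are removed for as long as they appear in chars (a set of characters, not a prefix)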
>>> string="hello word"
>>> string.lstrip("hello ")
'word'
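
Since chars is treated as a set of characters rather than as a prefix, lstrip() can remove more than expected. The sketch below contrasts it with slicing off a known prefix; the URL is a made-up example:

```python
url = "www.world.com"
# Every leading "w" and "." is removed, including the "w" of "world"
stripped = url.lstrip("w.")
# Slicing removes exactly the prefix and nothing more
prefix = "www."
without_prefix = url[len(prefix):] if url.startswith(prefix) else url
print(stripped, without_prefix)
```

Python 3.9 and later also provide str.removeprefix() for the second case.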

Splits the string at the first occurrence of the specified separator and returns a 3-element tuple: the substring to the left of the separator, the separator itself, and the substring to the right of the separator. If the separator is not found, the tuple contains the original string followed by two empty strings.

partition(self, sep):

Parameter	Description
sep	The separator

# Returns a tuple type
>>> string="www.ansheng.me"
>>> string.partition("ansheng")
('www.', 'ansheng', '.me')
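
A short sketch of the two edge cases worth remembering: a separator that is missing, and a separator that occurs more than once.

```python
string = "www.ansheng.me"
# Separator absent: the whole string comes back, followed by two empty strings
head, sep, tail = string.partition("@")
# Separator present several times: only the first occurrence is used
first, dot, rest = string.partition(".")
print((head, sep, tail), (first, dot, rest))
```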

Replaces old (the old substring) with new (the new substring). If the third parameter count is specified, at most count replacements are made

replace(self, old, new, count=None):

Parameter	Description
old	Substring to be replaced
new	New string that replaces the old substring
count	Optional integer; at most count replacements are made

>>> string="www.ansheng.me"
# Replace "www." with the new string "https://"
>>> string.replace("www.","https://")
'https://ansheng.me'
# Replace "w" with "a", but only for the first 2 occurrences
>>> string.replace("w","a",2)
'aaw.ansheng.me'

Returns the index of the last occurrence of the substring, or -1 if there is no match.

rfind(self, sub, start=None, end=None):

Parameter	Description
sub	The substring to search for
start	Index where the search starts, default is 0
end	Index where the search ends, defaults to the length of the string

>>> string="hello word"
# rfind searches from the right and returns the index of the last occurrence
>>> string.rfind("o")
7
# Restrict the search to the range string[0:6]
>>> string.rfind("o",0,6)
4

Returns the index of the last occurrence of the substring sub in the string. If there is no match, a ValueError is raised. The optional parameters start and end limit the search range.

rindex(self, sub, start=None, end=None):

Parameter	Description
sub	The substring to search for
start	Index where the search starts, default is 0
end	Index where the search ends, defaults to the length of the string

>>> string="hello word"
# Reverse lookup index
>>> string.rindex("o")
7
# Report an error if it is not found
>>> string.rindex("a")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found

Returns a new string with the original string right-justified and padded with the fill character (a space by default) to the specified width. Returns the original string if the specified width is less than the length of the string.

rjust(self, width, fillchar=None):

Parameter	Description
width	Total width of the string after padding
fillchar	Fill character, defaults to a space

>>> string="hello word"
>>> len(string)
10
>>> string.rjust(10,"*")
'hello word'
>>> string.rjust(12,"*")
'**hello word'

Splits the string from right to left at the specified separator. If maxsplit is given, at most maxsplit splits are made

rsplit(self, sep=None, maxsplit=None):

Parameter	Description
sep	Separator; by default, runs of whitespace
maxsplit	Maximum number of splits

>>> string="www.ansheng.me"
>>> string.rsplit(".",1)
['www.ansheng', 'me']
>>> string.rsplit(".",2)
['www', 'ansheng', 'me']

Removes the specified characters (whitespace by default) from the end of the string

rstrip(self, chars=None):

Parameter	Description
chars	The set of characters to remove

# Remove matching characters from the end
>>> string="hello word"
>>> string.rstrip("d")
'hello wor'

Splits the string from left to right at the specified separator. If maxsplit is given, at most maxsplit splits are made

split(self, sep=None, maxsplit=None):

Parameter	Description
sep	Separator; by default, runs of whitespace
maxsplit	Maximum number of splits

>>> string="www.ansheng.me"
# Split on "." at most once
>>> string.split(".",1)
['www', 'ansheng.me']
# Split on "." at most twice
>>> string.split(".",2)
['www', 'ansheng', 'me']
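
Calling split() with no separator is not the same as splitting on a single space. A minimal sketch with a made-up input line:

```python
line = "  a  b   c "
# No separator: runs of whitespace count as one separator, no empty strings
default_split = line.split()
# Explicit " ": every single space is a separator, so empty strings appear
space_split = line.split(" ")
print(default_split)
print(space_split)
```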

Splits the string at line boundaries and returns a list of the lines. If keepends is true, the line breaks are kept at the end of each line

splitlines(self, keepends=False):

Parameter	Description
keepends	Whether to keep the line-break characters in the resulting lines; default is False

# Define a variable containing line breaks; '\n' starts a new line
>>> string="www\nansheng\nme"
# Output content
>>> print(string)
www
ansheng
me
# Split the lines into a list, keeping the line breaks (1 is treated as True)
>>> string.splitlines(1)
['www\n', 'ansheng\n', 'me']
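
For comparison, a sketch of both keepends settings on a made-up text that mixes Windows and Unix line endings:

```python
text = "www\r\nansheng\nme\n"
# Default: line terminators are dropped, and no empty trailing item is produced
without_ends = text.splitlines()
# keepends=True: each line keeps its own terminator
with_ends = text.splitlines(True)
print(without_ends)
print(with_ends)
```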

Checks whether the string begins with the specified prefix; returns True if it does, False otherwise. If start and end are specified, the check is made within that range.

startswith(self, prefix, start=None, end=None):

Parameter	Description
prefix	The prefix to test for
start	Optional; index where the check starts
end	Optional; index where the check ends

>>> string="www.ansheng.me"
>>> string.startswith("www")
True
>>> string.startswith("www",3)
False

Removes the specified characters from the beginning and end of the string (whitespace by default)

strip(self, chars=None):

Parameter	Description
chars	The set of characters to remove from both ends of the string

>>> string=" www.ansheng.me "
>>> string
' www.ansheng.me '
# Remove the spaces
>>> string.strip()
'www.ansheng.me'
>>> string="_www.ansheng.me_"
# Remove the "_" characters from both ends
>>> string.strip("_")
'www.ansheng.me'

Swaps the case of every letter in the string: uppercase becomes lowercase and lowercase becomes uppercase

swapcase(self):

>>> string="hello WORD"
>>> string.swapcase()
'HELLO word'

Returns a title-cased version of the string, in which every word starts with an uppercase letter and the remaining letters are lowercase.

title(self):

>>> string="hello word"
>>> string.title()
'Hello Word'

Converts the characters of the string according to the translation table given by the parameter table. The signature shown here is the Python 2 one, where table is a 256-character string and the characters to filter out are passed in deletechars; in Python 3, translate() takes only the table.

translate(self, table, deletechars=None):

Parameter	Description
table	Translation table, created with the maketrans method
deletechars	Characters to remove from the string (Python 2 only)
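
In Python 3, a translation table is usually built with str.maketrans(), and the characters to delete are passed as its third argument rather than to translate(). A small sketch with made-up sample data:

```python
# Map a->1, b->2, c->3 and delete every "-" (third argument of maketrans)
table = str.maketrans("abc", "123", "-")
translated = "a-b-c-abc".translate(table)
print(translated)
```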

Convert lowercase letters to uppercase letters in a string

upper(self):

>>> string="hello word"
>>> string.upper()
'HELLO WORD'

Returns a string of the specified width. The original string is right-justified and padded on the left with 0

zfill(self, width):

Parameter	Description
width	Width of the resulting string; the original string is right-justified and zeros are added in front

>>> string="hello word"
>>> string.zfill(10)
'hello word'
>>> string.zfill(20)
'0000000000hello word'

Converting between str and bytes

In UTF-8, a Chinese character usually takes three bytes, and a byte is eight bits

3.5.x example

The code is as follows:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_

var = "中文"
for n in var:
    print(n)

print("================")

var2 = "zhongwen"
for n in var2:
    print(n)

Execution result:

C:\Python35\python.exe F:/Python_code/sublime/Day03/str.py
中
文
================
z
h
o
n
g
w
e
n

2.7.x example

The code is as follows:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_

var = "中文"
for n in var:
    print(n)

print("================")

var2 = "zhongwen"
for n in var2:
    print(n)

Execution result:

C:\Python27\python.exe F:/Python_code/sublime/Day03/str.py
�
�
�
�
�
�
================
z
h
o
n
g
w
e
n

As the examples above show, Python 3.5.x iterates over a string one character at a time, whether the text is Chinese or English. Python 2.7.x does not: it iterates byte by byte. That is why the Chinese text comes out garbled, and why six lines are printed: in UTF-8 each of these Chinese characters takes three bytes, so the two characters produce six bytes.

In Python 3.5.x you can work with both characters and bytes. The built-in bytes() converts a string into bytes

var="中文"
for n in var:
    print(n)
    bytes_list = bytes(n, encoding='utf-8')
    # The bytes repr shows non-ASCII bytes as \x hexadecimal escapes
    print(bytes_list)
    for x in bytes_list:
        # x is the byte value as a decimal int; bin(x) shows it in binary
        print(x,bin(x))

Output results

# The character
中
# Its UTF-8 bytes, shown with hexadecimal escapes
b'\xe4\xb8\xad'
# 228 is the byte value in decimal, 0b11100100 is the same value in binary
228 0b11100100
184 0b10111000
173 0b10101101
文
b'\xe6\x96\x87'
230 0b11100110
150 0b10010110
135 0b10000111

The b prefix marks a bytes literal, and \xe4 is a single byte written as a hexadecimal escape
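
The difference between counting characters and counting bytes can be checked directly in Python 3. The variable names below are illustrative only:

```python
text = "中文"
data = text.encode("utf-8")
char_count = len(text)            # number of characters in the str
byte_count = len(data)            # number of bytes after UTF-8 encoding
hex_view = data.hex()             # the same bytes as a plain hex string
restored = data.decode("utf-8")   # bytes -> str again
print(char_count, byte_count, hex_view, restored)
```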

Other knowledge points

Indexes

An index is the position of a value in a list or other sequence type

Define a list, then look up the position of the value "Linux" in it

>>> list_os = ["Windows","Linux","Mac","Unix"]
>>> list_os.index("Linux")
1
>>> list_os[1]
'Linux'

Escaping with \

Python allows you to escape certain characters to achieve effects that are difficult to describe simply with characters

# The most commonly used escapes are '\n' and '\t': '\n' starts a new line, and '\t' inserts a tab
>>> string="My \n Name  \t is"
>>> print(string)
My
 Name    is

Concatenating with +

You can use the + sign to concatenate multiple strings or string variables

>>> a="my "
>>> b="name "
>>> c="is "
>>> d="ansheng"
>>> a+b+c+d
'my name is ansheng'

Slicing

The slice operator is a sequence name followed by square brackets containing optional numbers separated by colons. It looks very similar to the index operator. The numbers are optional, but the colon is required. The first number is the start position of the slice, the second number is where the slice ends, and the third number is the step, that is, the interval between the elements taken. If you omit the first number, Python starts at the beginning of the sequence. If you omit the second number, Python stops at the end of the sequence. Note that the returned sequence starts at the start position and ends just before the end position: the start position is included in the slice, while the end position is excluded.

>>> os="Linux"
>>> os
'Linux'
>>> os[0:2]
'Li'
>>> os[0:4:2]
'Ln'

More examples are as follows

Slice	Description
[:]	Extract the entire string, from beginning to end
[start:]	Extract from start to the end of the string
[:end]	Extract from the beginning to end - 1
[start:end]	Extract from start to end - 1
[start:end:step]	Extract every step-th character from start to end - 1
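
Indexes and steps may also be negative, which counts from the end or walks backwards. A brief sketch, using a made-up variable name:

```python
os_name = "Linux"
last_char = os_name[-1]         # negative indexes count from the end
last_three = os_name[-3:]       # from the third-to-last character to the end
reversed_name = os_name[::-1]   # a step of -1 walks the string backwards
every_other = os_name[::2]      # every second character
print(last_char, last_three, reversed_name, every_other)
```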

Indexes and slices work on strings, lists, and tuples alike

An index retrieves a single character or value
A slice retrieves a range of characters or values

Example:

# Define a list with three elements
>>> var=["Linux","Win","Unix"]
# Get a single value by index
>>> var[0]
'Linux'
# Get multiple values by slicing
>>> var[0:2]
['Linux', 'Win']
>>> var[1:3]
['Win', 'Unix']

Posted by chaoswuz on Sat, 16 Nov 2019 04:54:29 -0800