Preface
Text is everywhere in daily life, and programming languages represent it with strings. The string is one of the most commonly used data types in Python. Before graphical interfaces existed, programs dealt almost entirely with strings and numbers, and later web pages and Windows applications still rely heavily on string operations. Different countries also use different languages, so strings are stored with different character encodings. The things that are easiest to overlook are often the most important.
English Vocabulary
Word | Meaning | Full name |
---|---|---|
UTF-8 | A variable-length character encoding for Unicode | Unicode Transformation Format (8-bit) |
builtins | Built-in module | \ |
format | Format | \ |
separator | Separator | \ |
suffix | Suffix | \ |
1, String encoding
Python source code is itself a text file, so when your source contains Chinese, be sure to save the file with UTF-8 encoding. To make the Python interpreter read the source as UTF-8, we usually write these two lines at the beginning of the file:

    #!/usr/bin/python3
    # -*- coding:utf-8 -*-

The first comment line tells Linux/OS X systems that this is an executable Python program; Windows ignores this comment.
The second comment line tells the Python interpreter to read the source code as UTF-8; otherwise, Chinese text written in the source may come out garbled. Python 3 already assumes UTF-8 when no declaration is present, so this line matters mostly for Python 2 or for files saved in another encoding. I personally recommend writing these two lines in every Python file.
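To make "variable-length encoding" concrete, here is a minimal sketch that encodes one ASCII letter and one Chinese character with `str.encode()`. The variable names are only illustrative.

```python
# UTF-8 uses a different number of bytes depending on the character.
ascii_bytes = 'A'.encode('utf-8')      # an ASCII letter takes one byte
chinese_bytes = '中'.encode('utf-8')   # this Chinese character takes three bytes

print(ascii_bytes, len(ascii_bytes))
print(chinese_bytes, len(chinese_bytes))

# Decoding with the same encoding restores the original text.
restored = chinese_bytes.decode('utf-8')
print(restored)
```

Decoding with a different encoding than the one used for encoding is the usual source of garbled text.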
2, Simple use of string
2.1 print the string with print().
In Python, a string is written between ASCII double quotation marks (") or single quotation marks (')

    #!/usr/bin/python3
    # -*- coding:utf-8 -*-

    content1 = 'hello world --- Python'
    content2 = "hello world --- Java"

    print(content1)
    print(content2)
2.2 Displaying double quotation marks (") or single quotation marks (') inside a string
Mixing single and double quotation marks

    #!/usr/bin/python3
    # -*- coding:utf-8 -*-

    content3 = "Let's go"
    content4 = 'Xiao Ming broke the glass while playing table tennis. He is really "impressive"'

    print(content3)
    print(content4)
Using the escape character \

    #!/usr/bin/python3
    # -*- coding:utf-8 -*-

    content5 = "C,Html,JavaScript,Css,\"Python\",Java,Markdown"
    content6 = 'My name is \'Hui\'!'

    print(content5)
    print(content6)

Output:

    C,Html,JavaScript,Css,"Python",Java,Markdown
    My name is 'Hui'!
2.3 Chinese quotation marks (single ‘ ’ and double “ ”)
In Chinese writing, however, single quotation marks (‘ ’) and double quotation marks (“ ”) serve several purposes:
- Mark a quotation
- Indicate a specific appellation
- Express a special meaning
- Express irony or ridicule, or add emphasis

    content7 = "“What are you afraid of? The beauty of the sea is right here!” I said"
    content8 = 'The horses painted by the modern painter Xu Beihong, as some critics say, “have both form and spirit and are full of vitality”.'
    content9 = "When they (referring to friends) maintain their “order” in prison, they tear off their mask of “civilization”."

    print(content7)
    print(content8)
    print(content9)

Note: Chinese (full-width) single quotation marks (‘ ’) and double quotation marks (“ ”) inside a string do not need the escape character \, because they are different characters from the ASCII quotes that delimit the string
2.4 Operating on strings with operators

    #!/usr/bin/python3
    # -*- coding:utf-8 -*-

    # Operating on strings with operators
    print('5' + '3')    # --> '53' (concatenation)
    print('--' * 20 + 'Split line' + '--' * 20)    # '--' * 20 concatenates '--' 20 times

    # String accumulation
    result = ''
    for i in range(10):
        result += str(i)
    print(result)    # --> '0123456789'
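As a hedged alternative to accumulating with `+=` in a loop, the built-in `str.join()` method builds the same result in one call. This is only a sketch of a common idiom, not something the example above requires.

```python
# Build '0123456789' with join() instead of repeated += in a loop.
digits = ''.join(str(i) for i in range(10))
print(digits)

# Repetition and concatenation combine in the same way as above.
line = '--' * 3 + 'Split line' + '--' * 3
print(line)
```

`join()` is usually preferred when many pieces are combined, because it creates the final string once instead of creating a new string on every iteration.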
3, String formatting
Python supports a formatting style that is consistent with the C language. It is implemented with the % operator, as follows:
Format | Meaning |
---|---|
%c | Single character (an integer code point or a string of length 1) |
%r | String (converted with repr()) |
%s | String (converted with str()) |
%d or %i | Signed decimal integer |
%u | Decimal integer (obsolete, identical to %d) |
%o | Octal integer |
%x or %X | Hexadecimal integer (lowercase or uppercase digits) |
%e or %E | Exponential (scientific) notation |
%f or %F | Floating-point number (6 decimal places by default, rounded) |
%g or %G | Uses %e or %E if the exponent is less than -4 or not less than the precision; otherwise uses %f or %F |
As you may have guessed, the % operator formats a string. Inside the string, %s is replaced with a string and %d with an integer. However many %? placeholders there are, the same number of variables or values must follow in matching order, wrapped in parentheses. If there is only one %?, the parentheses can be omitted. Common placeholders are %d (integer), %f (floating-point number), %s (string), and %x (hexadecimal integer).
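The rule about one value per placeholder can be sketched as follows. The values `age` and `height` are made up purely for illustration.

```python
name = 'Hui'
age = 21          # illustrative value
height = 1.75     # illustrative value

# One placeholder: the parentheses can be omitted.
one = 'Hello, %s' % name

# Several placeholders: values go in a tuple, in the same order.
many = '%s is %d years old and %.2f m tall' % (name, age, height)

# The same value can be shown in different bases.
hex_text = '%d in hexadecimal is %x' % (255, 255)

print(one)
print(many)
print(hex_text)
```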
IPython test
    In [7]: # %c test
    In [8]: 'a%cc' % 'b'
    Out[8]: 'abc'
    In [9]: 'a%cc%c' % ('b', 'd')
    Out[9]: 'abcd'
    In [10]: 'a%cc%c' % ('b', 100)
    Out[10]: 'abcd'

    In [11]: # %r test
    In [12]: 'a%rc' % 'b'
    Out[12]: "a'b'c"
    In [13]: 'a%rc%r' % ('b', 5)
    Out[13]: "a'b'c5"

    In [14]: # %s test
    In [15]: 'a%sc' % 'b'
    Out[15]: 'abc'
    In [16]: 'a%sc%s' % ('b', 10)
    Out[16]: 'abc10'
    In [17]: 'a%sc%s' % ('b', 3.14)
    Out[17]: 'abc3.14'
    In [18]: 'a%sc%s' % ('b', 'chinese')
    Out[18]: 'abcchinese'

    # Number tests
    In [19]: 'num=%d' % 150
    Out[19]: 'num=150'
    In [20]: 'num=%f' % 3.14
    Out[20]: 'num=3.140000'
    In [21]: 'num=%d' % 3.14
    Out[21]: 'num=3'
    In [22]: 'num=%i' % 100
    Out[22]: 'num=100'
    In [23]: 'num=%u' % 100
    Out[23]: 'num=100'
    In [24]: 'num=%o' % 100
    Out[24]: 'num=144'
    In [25]: 'num=%x' % 100
    Out[25]: 'num=64'
    In [26]: 'num=%e' % 100
    Out[26]: 'num=1.000000e+02'
    In [27]: 'num=%g' % 100
    Out[27]: 'num=100'
%r inserts the repr() of the value, so quotation marks and escape sequences appear exactly as they would be written in source code. If you're not sure what to use, %s will always work: it converts any data type to a string with str()
When formatting integers and floating-point numbers, you can also specify zero padding and the number of decimal places.

    In [32]: 'num=%03d' % 1
    Out[32]: 'num=001'
    In [33]: 'num=%03d' % 2
    Out[33]: 'num=002'
    In [34]: 'num=%03d' % 10
    Out[34]: 'num=010'
    In [35]: 'num=%03d' % 100
    Out[35]: 'num=100'
    In [36]: 'num=%.2f' % 3.1415926
    Out[36]: 'num=3.14'
    In [37]: 'num=%.6f' % 3.1415926
    Out[37]: 'num=3.141593'
    In [38]: '%05.2f' % 3.1415
    Out[38]: '03.14'

- The 3 in %03d means that if the formatted number is shorter than 3 characters, zeros are added in front until its length is 3
- The 5 in %05.2f means that if the formatted number is shorter than 5 characters (the decimal point counts), zeros are added in front until its length is 5. The 2 means two digits are kept after the decimal point, with the last digit rounded. 3.1415 is first rounded to two decimal places as 3.14, which is 4 characters long, so one 0 is added in front
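Besides zero padding, the same width field works with spaces and a few other flags. The following sketch shows some combinations; the `|` character is only there to make the padding visible.

```python
right = '%5d|' % 42          # width 5, padded with spaces on the left
left = '%-5d|' % 42          # '-' flag: left-aligned, padded on the right
signed = '%+d' % 42          # '+' flag: always show the sign
padded = '%08.3f' % 3.14159  # width 8, zero padded, 3 decimal places

print(right)
print(left)
print(signed)
print(padded)
```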
Template string
Putting f in front of a string literal turns it into a template string (a formatted string literal, or f-string, available since Python 3.6). Inside a template string, {xxx} can directly reference variables or evaluate expressions.
IPython test

    In [4]: a = 10
    In [5]: b = 20
    In [6]: f'num1={a}, num2={b}'
    Out[6]: 'num1=10, num2=20'
    In [7]: f'num1={a}, num2={b}, num3={a * b}'
    Out[7]: 'num1=10, num2=20, num3=200'
    In [8]: lang = 'Python'
    In [10]: f'language is {lang}, length={len(lang)}'
    Out[10]: 'language is Python, length=6'
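Template strings also accept a format specification after a colon, which covers the same ground as the %03d and %.2f examples earlier. A minimal sketch, with variable names chosen only for illustration:

```python
pi = 3.1415926
number = 7
lang = 'Python'

two_places = f'{pi:.2f}'      # two decimal places, like '%.2f'
zero_padded = f'{number:03d}' # zero padded to width 3, like '%03d'
hex_value = f'{255:x}'        # hexadecimal, like '%x'
aligned = f'{lang:>10}|'      # right-aligned in a field of width 10

print(two_places, zero_padded, hex_value, aligned)
```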
4, String method
Because strings are used so often in programming, Python provides many methods for working with them.
4.1 dir() view all methods of str
We can use the built-in function dir() (from the builtins module) to view the attributes and methods of a class. It returns them as a list of names
Print all methods of str

    def out_demo_title(title):
        """
        Print case title
        :param title: title text to print
        :return:
        """
        title = " " * 40 + title + " " * 40
        print("-" * 100)
        print(title)
        print("-" * 100)


    def iter_out(iter_obj, row_num, left_just=18):
        """
        Specify format iteration output
        :param iter_obj: Iterable object to be output
        :param row_num: How many items are output in a row
        :param left_just: Left-aligned width
        :return:
        """
        for index in range(len(iter_obj)):
            print(iter_obj[index].ljust(left_just), end='')
            if (index+1) % row_num == 0:
                print()
        print('\n')


    def main():
        content = 'Hello World --- Python'
        print(content)

        # View all methods of the string class
        print(dir(str))

        # Everything on one line is tiring to read, so tidy up the output
        title = 'str Class(%d)' % len(dir(str))
        out_demo_title(title)
        iter_out(dir(str), row_num=5)


    main()
All methods of str are as follows:

    ----------------------------------------------------------------------------------------------------
                                            str Class(78)
    ----------------------------------------------------------------------------------------------------
    __add__           __class__         __contains__      __delattr__       __dir__
    __doc__           __eq__            __format__        __ge__            __getattribute__
    __getitem__       __getnewargs__    __gt__            __hash__          __init__
    __init_subclass__ __iter__          __le__            __len__           __lt__
    __mod__           __mul__           __ne__            __new__           __reduce__
    __reduce_ex__     __repr__          __rmod__          __rmul__          __setattr__
    __sizeof__        __str__           __subclasshook__  capitalize        casefold
    center            count             encode            endswith          expandtabs
    find              format            format_map        index             isalnum
    isalpha           isascii           isdecimal         isdigit           isidentifier
    islower           isnumeric         isprintable       isspace           istitle
    isupper           join              ljust             lower             lstrip
    maketrans         partition         replace           rfind             rindex
    rjust             rpartition        rsplit            rstrip            split
    splitlines        startswith        strip             swapcase          title
    translate         upper             zfill
We can see that there are 78 names in total: 33 magic methods and 45 ordinary methods. The exact count depends on the Python version.
Magic method is the nickname of a special method. Special methods in Python are named in the form __xxx__ (two underscores before and after, with the method name in the middle), such as __init__ and __class__.
Python is also an object-oriented programming language, yet it uses len(list) instead of list.len() to find the length of a collection. Behind the scenes a special method does the work: len() calls list.__len__(). This stays fully consistent with object orientation while making the code simpler and easier to understand, which is one example of Python's conciseness.
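To see how len() relies on a special method, here is a minimal sketch with a made-up class. `Playlist` and its contents are hypothetical and exist only for this illustration.

```python
class Playlist:
    """Hypothetical container used only to demonstrate __len__."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # len(playlist) ends up calling this method.
        return len(self._songs)


playlist = Playlist(['song a', 'song b', 'song c'])
count = len(playlist)          # the usual way
same = playlist.__len__()      # what happens behind the scenes

print(count, same)
```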
For more details on magic methods in Python, see the article "Magic properties in Python" in the [Python advanced column].
4.2 use help() to view the documentation of methods and functions
    def iter_out(iter_obj, row_num, left_just=18) -> (iter, int):
        """
        Specify format iteration output
        :param iter_obj: Iteratable object to be output
        :param row_num: How many are output in a row
        :param left_just: Left aligned width
        :return:
        """
        for index in range(len(iter_obj)):
            print(iter_obj[index].ljust(left_just), end='')
            if (index+1) % row_num == 0:
                print()
        print('\n')


    print(">>>use help()View document")
    help(iter_out)
    help(str.split)

Output results:

    >>>use help()View document
    Help on function iter_out in module __main__:

    iter_out(iter_obj, row_num, left_just=18) -> (<built-in function iter>, <class 'int'>)
        Specify format iteration output
        :param iter_obj: Iteratable object to be output
        :param row_num: How many are output in a row
        :param left_just: Left aligned width
        :return:

    Help on method_descriptor:

    split(...)
        S.split(sep=None, maxsplit=-1) -> list of strings

        Return a list of the words in S, using sep as the
        delimiter string.  If maxsplit is given, at most maxsplit
        splits are done. If sep is not specified or is None, any
        whitespace string is a separator and empty strings are
        removed from the result.
Note: when passing a function or method to help(), do not write its parentheses (), because () would call the function instead of passing it
4.3 common methods of string
Method name | Function |
---|---|
upper(), lower() | Convert a string to uppercase or lowercase |
split() | Split a string |
startswith(), endswith() | Check the beginning or end of a string |
replace() | Replace substrings |
strip(), lstrip(), rstrip() | Remove leading and trailing whitespace (or other given characters) |
find() | Find a substring in a string |
String case conversion upper(), lower()
    print(">>>String case conversion upper(),lower()")
    lowercase_letters = "a boy can do everything for girl"
    capital_letters = "HE IS JUST KIDDING"
    print(lowercase_letters.upper())
    print(capital_letters.lower())

Output results:

    >>>String case conversion upper(),lower()
    A BOY CAN DO EVERYTHING FOR GIRL
    he is just kidding
String split()
Method parameter | Meaning |
---|---|
sep | Separator to split the string on |
maxsplit | Maximum number of splits, default -1 |
- maxsplit is the maximum number of splits. The default is -1, which means the string is split at every separator
- Any negative maxsplit behaves the same way, splitting at every separator, while 0 means no split is performed
The return value of the split() method is a list
    print(">>>String splitting split()")
    content = '123#abc#admin#root'
    print(content.split(sep='#', maxsplit=0))
    print(content.split(sep='#', maxsplit=1))
    print(content.split(sep='#', maxsplit=2))
    print(content.split(sep='#', maxsplit=3))
    print(content.split(sep='#', maxsplit=4))
    print(content.split())       # No sep: splits on whitespace; none here, so a one-item list
    print(content.split('#'))    # We are used to using it this way

Output results:

    >>>String splitting split()
    ['123#abc#admin#root']
    ['123', 'abc#admin#root']
    ['123', 'abc', 'admin#root']
    ['123', 'abc', 'admin', 'root']
    ['123', 'abc', 'admin', 'root']
    ['123#abc#admin#root']
    ['123', 'abc', 'admin', 'root']
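The difference between calling split() with no separator and passing an explicit space is easy to miss, so here is a small sketch. The sample text is made up, and `rsplit()` (listed in the dir(str) output) is included to show splitting from the right.

```python
text = 'a  b c'   # note the two spaces between 'a' and 'b'

# No separator: runs of whitespace count as one separator.
by_whitespace = text.split()

# Explicit ' ' separator: every single space splits, leaving an empty string.
by_space = text.split(' ')

# rsplit() counts maxsplit from the right-hand side.
from_right = '123#abc#admin#root'.rsplit('#', 1)

print(by_whitespace)
print(by_space)
print(from_right)
```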
Checking the start and end of a string with startswith() and endswith()
Function: determine whether the string starts (or ends) with the specified character or substring

    In [14]: url = 'http://127.0.0.1/index.html'
    In [15]: url.startswith('http')
    Out[15]: True

    In [16]: # Match start
    In [17]: url.startswith('http')
    Out[17]: True
    In [18]: url.startswith('https')
    Out[18]: False

    In [19]: # Match End
    In [20]: url.endswith('html')
    Out[20]: True
    In [21]: url.endswith('index')
    Out[21]: False
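The table of common methods also lists replace(), strip(), lstrip(), rstrip(), and find(), which have no example above. The following is a minimal sketch with made-up sample text:

```python
raw = '  hello world  '

stripped = raw.strip()          # remove whitespace on both sides
left_stripped = raw.lstrip()    # remove whitespace on the left only
right_stripped = raw.rstrip()   # remove whitespace on the right only

replaced = stripped.replace('world', 'Python')   # returns a new string

found = stripped.find('world')  # index of the first match
missing = stripped.find('xyz')  # -1 when the substring is not found

print(repr(stripped), repr(left_stripped), repr(right_stripped))
print(replaced, found, missing)
```

Strings are immutable, so each of these methods returns a new string and leaves `raw` unchanged.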
5, Index, slice operation string
5.1 using index and slicing to take single characters and intercept strings
Slice syntax: object[start_index:end_index:step]
Parameter | Meaning |
---|---|
start_index | Start index (included in the result) |
end_index | End index (not included in the result) |
step | Step size |
step can be positive or negative, and its sign determines the slicing direction: positive takes values from left to right, negative takes values from right to left. When step is omitted, it defaults to 1, that is, characters are taken one at a time from left to right
    def out_demo_title(title):
        """
        Print case title
        :param title: title text to print
        :return:
        """
        title = " " * 40 + title + " " * 40
        print("-" * 100)
        print(title)
        print("-" * 100)


    out_demo_title("Index, slice operation string Demo")

    verify_codes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    print(">>>The index takes the first and last characters")
    print(verify_codes[0] + '\t' + verify_codes[len(verify_codes)-1] + '\n')

    print(">>>Slice the last character")
    print(verify_codes[-1:] + "\n")

    print(">>>Slice the first 26 lowercase letters")
    print(verify_codes[:26] + '\n')    # --> [:26] == [0:26]

    print(">>>Slice 26 capital letters")
    print(verify_codes[26:52] + '\n')

    print(">>>Slice the last 10 digits")
    print(verify_codes[-10:] + '\n')   # [-10:] == [len(verify_codes)-10:len(verify_codes)]

    print(">>>Slice all the preceding letters")
    print(verify_codes[:-10] + '\n')
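The demo above never sets step, so here is a short sketch of how positive and negative steps behave. The sample string is made up for brevity.

```python
letters = 'abcdef'

every_second = letters[::2]     # left to right, every second character
reversed_text = letters[::-1]   # negative step walks right to left
middle = letters[1:5:2]         # indexes 1 and 3
backwards = letters[5:1:-2]     # indexes 5 and 3, stopping before index 1

print(every_second, reversed_text, middle, backwards)
```

With a negative step, start_index should be to the right of end_index; otherwise the result is an empty string.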
5.2 for loop traversal string
    def iter_out(iter_obj, row_num, left_just=18) -> (iter, int):
        """
        Specify format iteration output
        :param iter_obj: Iteratable object to be output
        :param row_num: How many are output in a row
        :param left_just: Left aligned width
        """
        for index in range(len(iter_obj)):
            print(iter_obj[index].ljust(left_just), end='')
            if (index+1) % row_num == 0:
                print('\n')


    verify_codes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    # for loop traversal string
    for char in verify_codes:
        print(char)

    print('>>>for Loop through string')
    iter_out(verify_codes, row_num=10, left_just=11)
Output results (the one-character-per-line output of the plain for loop is omitted):

    >>>for Loop through string
    a          b          c          d          e          f          g          h          i          j

    k          l          m          n          o          p          q          r          s          t

    u          v          w          x          y          z          A          B          C          D

    E          F          G          H          I          J          K          L          M          N

    O          P          Q          R          S          T          U          V          W          X

    Y          Z          0          1          2          3          4          5          6          7

    8          9
Doesn't the output look much better with the iter_out() function defined above?
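When both the position and the character are needed during traversal, the built-in `enumerate()` avoids writing `range(len(...))` by hand. A minimal sketch, using a shortened sample string rather than the full verify_codes:

```python
sample = 'abc123'   # shortened sample, used only for this illustration

pairs = []
for index, char in enumerate(sample):
    # enumerate() yields (index, character) pairs, starting at 0
    pairs.append((index, char))
    print(index, char)
```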