String in Python - str

Preface

Text is everywhere in everyday life, and programming languages represent it with strings. The string is one of the most commonly used data types in Python. Before graphical interfaces existed, programs dealt almost entirely with strings and numbers, and string handling is just as visible in later web pages and Windows applications. Because different countries use different languages, strings are also stored using different character encodings. The easier something is to underestimate, the more important it tends to be.

English Vocabulary

Word        Meaning                                            Full name
UTF-8       A variable-length character encoding for Unicode   Unicode Transformation Format (8-bit)
builtins    Built-in module                                    -
format      Format                                             -
separator   Separator                                          -
suffix      Suffix                                             -

1, String encoding

Python source code is itself a text file, so when your source code contains Chinese (or any other non-ASCII text), you should save the file with UTF-8 encoding. To make the Python interpreter read the source as UTF-8, these two lines are commonly written at the beginning of the file:

#!/usr/bin/python3
# -*- coding:utf-8 -*-

The first comment line tells Linux/OS X that this file is an executable Python program; Windows ignores this comment.

The second comment line tells the Python interpreter to read the source code as UTF-8; otherwise, Chinese text written in the source code may come out garbled. Python 3 already uses UTF-8 as the default source encoding, so the declaration matters mainly for Python 2. I personally recommend writing these two lines in every Python file.
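To make "variable-length encoding" concrete, here is a minimal sketch that encodes one ASCII character and one Chinese character to UTF-8 and compares the byte counts. The variable names are illustrative and do not come from the original article.

```python
# Encode text to UTF-8 bytes and decode it back again.
ascii_text = "A"
chinese_text = "中"

ascii_bytes = ascii_text.encode("utf-8")      # ASCII characters take 1 byte each
chinese_bytes = chinese_text.encode("utf-8")  # this character takes 3 bytes

print(len(ascii_bytes), len(chinese_bytes))
print(chinese_bytes.decode("utf-8") == chinese_text)  # round trip is lossless
```

The same str holds both characters; only the encoded byte form differs in length.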

2, Simple use of string

2.1 Printing a string with print()

In Python, a string literal is enclosed in ASCII double quotation marks (") or single quotation marks (').

#!/usr/bin/python3
# -*- coding:utf-8 -*-

content1 = 'hello world --- Python'
content2 = "hello world --- Java"

print(content1)
print(content2)

2.2 Displaying a double quotation mark (") or single quotation mark (') inside a string

Use single and double quotation marks together

#!/usr/bin/python3
# -*- coding:utf-8 -*-

content3 = "Let's go"
content4 = 'Xiao Ming broke the glass playing table tennis. He is really "amazing"'
	
print(content3)
print(content4)

Use the escape character \

#!/usr/bin/python3
# -*- coding:utf-8 -*-

content5 = "C,Html,JavaScript,Css,\"Python\",Java,Markdown"
content6 = 'My name is \'Hui\'!'

print(content5)
print(content6)

Operation results:

C,Html,JavaScript,Css,"Python",Java,Markdown

My name is 'Hui'!
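As a small check on the two approaches above, this sketch (with made-up variable names) shows that an escaped quotation mark and a mixed-quote literal produce exactly the same string, because the backslash itself is not stored.

```python
# Two ways of writing the same string value.
escaped = "C,Html,\"Python\",Java"   # escape the inner double quotes
mixed = 'C,Html,"Python",Java'       # or delimit with single quotes instead

print(escaped == mixed)   # both literals hold identical text
print(len('\''))          # an escaped quote counts as one character
```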

2.3 Chinese quotation marks (single ‘ ’ and double “ ”)

In the rich and profound Chinese language, however, single quotation marks (‘ ’) and double quotation marks (“ ”) can:

  1. Mark a quotation
  2. Indicate a specific appellation
  3. Express a special meaning
  4. Express irony or ridicule, or add emphasis

content7 = "“What are you afraid of? The beauty of the sea is right here!” I said."
content8 = 'Modern painter Xu Beihong’s horses, as some critics say, “have both form and spirit and are full of vitality”.'
content9 = "When they (referring to friends) maintain their “order” in prison, they tear off their mask of “civilization”."

print(content7)
print(content8)
print(content9)

Note: Chinese single quotation marks (‘ ’) and double quotation marks (“ ”) inside a string do not need the escape character \, because they are different characters from the ASCII ' and " that delimit a string literal.

2.4 Using operators on strings

#!/usr/bin/python3
# -*- coding:utf-8 -*-

# Operators on strings
print('5' + '3')    # prints 53: + concatenates two strings
print('--' * 20 + 'Split line' + '--' * 20)    # '--' * 20 repeats '--' 20 times

# String accumulation
result = ''
for i in range(10):
    result += str(i)
print(result)   # -->'0123456789'
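The accumulation loop above can also be written with str.join(), which is a common alternative when many pieces are combined. This is only a sketch of that alternative, not the author's code.

```python
# Build the same '0123456789' string with join() instead of repeated +=.
digits = [str(i) for i in range(10)]   # ['0', '1', ..., '9']
result = ''.join(digits)               # the string before .join is the glue
print(result)
```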

3, String formatting

In Python, string formatting with the % operator follows the same conventions as the C language. The conversion types are as follows:

Format      Meaning
%c          Single character (an integer code value or a string of length 1)
%r          String (converted with repr())
%s          String (converted with str())
%d or %i    Decimal integer
%u          Decimal integer (obsolete, identical to %d)
%o          Octal integer
%x or %X    Hexadecimal integer (lowercase or uppercase digits)
%e or %E    Exponential notation (scientific notation)
%f or %F    Floating-point number (6 decimal places by default, rounded)
%g or %G    Exponential format if the exponent is less than -4 or not less than the precision, decimal format otherwise

As you may have guessed, the % operator formats a string. Inside the string, %s is replaced with a string and %d with an integer. Each %? placeholder must be matched, in order, by a variable or value in the tuple that follows. If there is only one %?, the parentheses can be omitted. Common placeholders are %d (integer), %f (floating-point number), %s (string), and %x (hexadecimal integer).
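The rules in the paragraph above can be sketched in a few lines. The names and values are made up for illustration; the dictionary form and the %% escape are standard features of % formatting that the article does not cover.

```python
name = 'Hui'
age = 20
height = 1.75

single = 'Hello, %s' % name                                # one placeholder, no parentheses
several = '%s is %d years old and %.2f m tall' % (name, age, height)
by_name = '%(name)s is %(age)d' % {'name': name, 'age': age}   # lookup by key
percent = 'progress: %d%%' % 80                            # %% produces a literal %
one_tuple = 'point: %s' % ((1, 2),)                        # wrap a tuple value in a 1-tuple

for line in (single, several, by_name, percent, one_tuple):
    print(line)
```

Wrapping a tuple value in a one-element tuple avoids having its items consumed as separate arguments.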

IPython test

In [7]: # %c test

In [8]: 'a%cc' % 'b'
Out[8]: 'abc'

In [9]: 'a%cc%c' % ('b', 'd')
Out[9]: 'abcd'

In [10]: 'a%cc%c' % ('b', 100)
Out[10]: 'abcd'
    
In [11]: # %r test

In [12]: 'a%rc' % 'b'
Out[12]: "a'b'c"

In [13]: 'a%rc%r' % ('b', 5)
Out[13]: "a'b'c5"

In [14]: # %s test

In [15]: 'a%sc' % 'b'
Out[15]: 'abc'

In [16]: 'a%sc%s' % ('b', 10)
Out[16]: 'abc10'

In [17]: 'a%sc%s' % ('b', 3.14)
Out[17]: 'abc3.14'

In [18]: 'a%sc%s' % ('b', 'chinese')
Out[18]: 'abcchinese'
    
# Integer test    
In [19]: 'num=%d' % 150
Out[19]: 'num=150'

In [20]: 'num=%f' % 3.14
Out[20]: 'num=3.140000'

In [21]: 'num=%d' % 3.14
Out[21]: 'num=3'

In [22]: 'num=%i' % 100
Out[22]: 'num=100'

In [23]: 'num=%u' % 100
Out[23]: 'num=100'

In [24]: 'num=%o' % 100
Out[24]: 'num=144'

In [25]: 'num=%x' % 100
Out[25]: 'num=64'

In [26]: 'num=%e' % 100
Out[26]: 'num=1.000000e+02'

In [27]: 'num=%g' % 100
Out[27]: 'num=100'    

%r inserts the repr() of the value, so quotation marks and escape sequences appear exactly as they would be written in source code. If you're not sure which to use, %s will always work: it converts a value of any data type to a string.

When formatting integers and floating-point numbers, you can also specify zero padding and the number of decimal places.

In [32]: 'num=%03d' % 1
Out[32]: 'num=001'

In [33]: 'num=%03d' % 2
Out[33]: 'num=002'

In [34]: 'num=%03d' % 10
Out[34]: 'num=010'

In [35]: 'num=%03d' % 100
Out[35]: 'num=100'

In [36]: 'num=%.2f' % 3.1415926
Out[36]: 'num=3.14'

In [37]: 'num=%.6f' % 3.1415926
Out[37]: 'num=3.141593'
    
In [38]: '%05.2f' % 3.1415
Out[38]: '03.14'

  • The 3 in %03d is the minimum width: if the result is shorter than 3 characters, it is padded on the left with 0 until its length is 3.
  • The 5 in %05.2f is the minimum width: if the result is shorter than 5 characters, it is padded on the left with 0 until its length is 5. The .2 keeps two digits after the decimal point, rounding the last one. 3.1415 is first rounded to 3.14, which has length 4 (the decimal point counts), so one 0 is added in front.

Template string

Adding f in front of a string literal turns it into a template string (an f-string, available since Python 3.6). Inside a template string, {xxx} can directly reference a variable or evaluate an expression.

IPython test

In [4]: a = 10

In [5]: b = 20

In [6]: f'num1={a}, num2={b}'
Out[6]: 'num1=10, num2=20'

In [7]: f'num1={a}, num2={b}, num3={a * b}'
Out[7]: 'num1=10, num2=20, num3=200'

In [8]: lang = 'Python'

In [10]: f'language is {lang}, length={len(lang)}'
Out[10]: 'language is Python, length=6'
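The width and precision options shown earlier for % formatting can also be written inside the braces of a template string, after a colon. A minimal sketch with illustrative values:

```python
pi = 3.1415926
count = 7

padded = f'{count:03d}'     # zero-pad to width 3, like %03d
rounded = f'{pi:.2f}'       # two decimal places, like %.2f
both = f'{pi:07.3f}'        # width 7, zero-padded, three decimal places
hex_value = f'{255:x}'      # hexadecimal, like %x

print(padded, rounded, both, hex_value)
```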


4, String method

Because strings are used so often in programming, Python provides many methods for working with them.

4.1 Viewing all methods of str with dir()

We can use the built-in function dir() (from the builtins module) to view the attributes of a class. It returns a list containing the names of all its methods.

Print all the methods of str

def out_demo_title(title):
    """
    Print case title
    :param title
    :return:
    """
    title = " " * 40 + title + " " * 40
    print("-" * 100)
    print(title)
    print("-" * 100)
    
    
def iter_out(iter_obj, row_num, left_just=18):
    """
    Specify format iteration output
    :param iter_obj: Iterable object to be output
    :param row_num: How many are output in a row
    :param left_just: Left aligned width
    :return:
    """
    for index in range(len(iter_obj)):
        print(iter_obj[index].ljust(left_just), end='')
        if (index+1) % row_num == 0:
            print()
            print('\n')


def main():
    content = 'Hello World --- Python'
    print(content)

    # View all methods of the string class
    print(dir(str))

    # One long line is tiring to read, so let's tidy up the output
    title = 'str Class(%d)' % len(dir(str))
    out_demo_title(title)
    iter_out(dir(str), row_num=5)
  	

main()

All methods of string are as follows:

-----------------------------------------------------------------------------------------------
                                        str Class(78)
-----------------------------------------------------------------------------------------------
__add__           __class__         __contains__      __delattr__       __dir__


__doc__           __eq__            __format__        __ge__            __getattribute__


__getitem__       __getnewargs__    __gt__            __hash__          __init__


__init_subclass__ __iter__          __le__            __len__           __lt__


__mod__           __mul__           __ne__            __new__           __reduce__


__reduce_ex__     __repr__          __rmod__          __rmul__          __setattr__


__sizeof__        __str__           __subclasshook__  capitalize        casefold


center            count             encode            endswith          expandtabs


find              format            format_map        index             isalnum


isalpha           isascii           isdecimal         isdigit           isidentifier


islower           isnumeric         isprintable       isspace           istitle


isupper           join              ljust             lower             lstrip


maketrans         partition         replace           rfind             rindex


rjust             rpartition        rsplit            rstrip            split


splitlines        startswith        strip             swapcase          title


translate         upper             zfill

We can see that there are 78 methods in total: 33 magic methods and 45 ordinary methods.

"Magic method" is the nickname for a special method. Special methods in Python are generally named in the form __xxx__ (two underscores before and after, with the method name in the middle), such as __init__ and __class__.

Python is an object-oriented programming language, yet it uses len(list) rather than list.len() to find the length of a collection. Behind the scenes a special method does the work: len() calls the object's __len__() method. This stays fully consistent with object orientation while making the code simpler and easier to understand, and it is one example of Python's conciseness.
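To illustrate how len() relies on __len__(), here is a minimal sketch. The Playlist class is hypothetical and exists only for this example.

```python
class Playlist:
    """Hypothetical container used only to demonstrate __len__."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # len(playlist) ends up calling this method
        return len(self._songs)


playlist = Playlist(['song a', 'song b', 'song c'])
print(len(playlist))                      # uses Playlist.__len__
print('abc'.__len__() == len('abc'))      # str works the same way
```

Defining __len__ is all it takes for a custom class to work with the built-in len().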

For details on magic methods, see the article "Magic properties in Python" in the Python advanced column.

4.2 Using help() to view the documentation of methods and functions

def iter_out(iter_obj, row_num, left_just=18):
    """
    Specify format iteration output
    :param iter_obj: Iterable object to be output
    :param row_num: How many are output in a row
    :param left_just: Left aligned width
    :return:
    """
    for index in range(len(iter_obj)):
        print(iter_obj[index].ljust(left_just), end='')
        if (index+1) % row_num == 0:
            print()
            print('\n')

print(">>>use help()View document")
help(iter_out)
help(str.split)

Output results:
>>>use help()View document
Help on function iter_out in module __main__:

iter_out(iter_obj, row_num, left_just=18)
    Specify format iteration output
    :param iter_obj: Iterable object to be output
    :param row_num: How many are output in a row
    :param left_just: Left aligned width
    :return:

Help on method_descriptor:

split(...)
    S.split(sep=None, maxsplit=-1) -> list of strings
    
    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.

Note: when using help(), pass the function or method name without the parentheses (), because adding () would call the function instead of passing it to help().

4.3 Common string methods

Method                         Function
upper(), lower()               Convert a string to uppercase or lowercase
split()                        Split a string
startswith(), endswith()       Test the beginning or end of a string
replace()                      Replace a substring
strip(), lstrip(), rstrip()    Remove leading and/or trailing whitespace
find()                         Find a substring in a string

String case conversion upper(), lower()

print(">>>String case conversion upper(),lower()")

lowercase_letters = "a boy can do everything for girl"
capital_letters = "HE IS JUST KIDDING"

print(lowercase_letters.upper())
print(capital_letters.lower())

>>>String case conversion upper(),lower()
A BOY CAN DO EVERYTHING FOR GIRL
he is just kidding

Splitting a string: split()

Parameter    Meaning
sep          Separator to split on
maxsplit     Maximum number of splits; default -1

  • maxsplit is the maximum number of splits. With the default of -1, or any other negative number, the string is split at every separator.
  • With maxsplit=0, the string is not split at all.

The return value of the split() method is a list.

print(">>>String splitting with split()")
content = '123#abc#admin#root'

print(content.split(sep='#', maxsplit=0))
print(content.split(sep='#', maxsplit=1))
print(content.split(sep='#', maxsplit=2))
print(content.split(sep='#', maxsplit=3))
print(content.split(sep='#', maxsplit=4))

print(content.split())  # No sep: splits on whitespace; there is none here, so the list holds the whole string

print(content.split('#'))   # The most common way to use it

>>>String splitting with split()
['123#abc#admin#root']
['123', 'abc#admin#root']
['123', 'abc', 'admin#root']
['123', 'abc', 'admin', 'root']
['123', 'abc', 'admin', 'root']
['123#abc#admin#root']
['123', 'abc', 'admin', 'root']
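The difference between calling split() with no separator and with an explicit one is easy to miss, so here is a small sketch. The sample text is made up; join() and rsplit() are str methods that appear in the dir(str) listing earlier.

```python
text = '  hello   world  '
content = '123#abc#admin#root'

by_whitespace = text.split()      # runs of whitespace collapse, no empty strings
by_space = text.split(' ')        # every single space splits, leaving empty strings
rejoined = '#'.join(content.split('#'))   # join() reverses split()
last = content.rsplit('#', 1)     # split once, starting from the right

print(by_whitespace)
print(len(by_space))
print(rejoined == content)
print(last)
```

An explicit separator preserves empty fields, which matters when parsing delimited data.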

Testing the beginning and end of a string: startswith() and endswith()

Function: check whether the string starts (or ends) with the specified character or substring.

In [14]: url = 'http://127.0.0.1/index.html'

In [15]: url.startswith('http')
Out[15]: True

In [16]: # Match start

In [17]: url.startswith('http')
Out[17]: True

In [18]: url.startswith('https')
Out[18]: False

In [19]: # Match End 

In [20]: url.endswith('html')
Out[20]: True

In [21]: url.endswith('index')
Out[21]: False
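The table in 4.3 also lists replace(), strip(), lstrip(), rstrip() and find(). A minimal sketch of those methods, using a made-up sample string:

```python
raw = '  Hello World  '

stripped = raw.strip()      # remove whitespace on both sides
left = raw.lstrip()         # remove leading whitespace only
right = raw.rstrip()        # remove trailing whitespace only

replaced = stripped.replace('World', 'Python')   # returns a new string
position = stripped.find('World')                # index of the first match
missing = stripped.find('Java')                  # -1 when nothing is found

print(repr(stripped), repr(left), repr(right))
print(replaced, position, missing)
```

Strings are immutable, so each of these methods returns a new string and leaves raw unchanged.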

5, Operating on strings with indexes and slices

5.1 Using indexes and slices to take single characters and substrings

Slice syntax: object[start_index:end_index:step]

Parameter      Meaning
start_index    Start index (inclusive)
end_index      End index (exclusive)
step           Step size

step can be positive or negative, and its sign determines the slicing direction: positive takes values from left to right, and negative takes values from right to left. When step is omitted it defaults to 1, that is, values are taken from left to right one at a time.

def out_demo_title(title):
    """
    Print case title
    :param title
    :return:
    """
    title = " " * 40 + title + " " * 40
    print("-" * 100)
    print(title)
    print("-" * 100)
    
out_demo_title("Index and slice demo")
verify_codes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

print(">>>The index takes the first and last characters")
print(verify_codes[0] + '\t' + verify_codes[len(verify_codes)-1] + '\n')

print(">>>Slice the last character")
print(verify_codes[-1:] + "\n")

print(">>>Slice the first 26 lowercase letters")
print(verify_codes[:26] + '\n')    # --> [:26] == [0:26]

print(">>>Slice 26 capital letters")
print(verify_codes[26:52] + '\n')

print(">>>Slice the last 10 digits")
print(verify_codes[-10:] + '\n')  # [-10:] == [len(verify_codes)-10:len(verify_codes)]

print(">>>Slice all the preceding letters")
print(verify_codes[:-10] + '\n')
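The slices above all use the default step of 1. The following sketch shows what other step values do, including a negative step; the variable names other than verify_codes are illustrative.

```python
verify_codes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
digits = verify_codes[-10:]          # '0123456789'

reversed_digits = digits[::-1]       # negative step walks right to left
every_other = digits[::2]            # step 2 skips every second character
backwards = digits[8:2:-2]           # indexes 8, 6, 4 (index 2 is excluded)

print(reversed_digits)
print(every_other)
print(backwards)
```

With a negative step, start_index must lie to the right of end_index, otherwise the result is empty.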

5.2 Traversing a string with a for loop

def iter_out(iter_obj, row_num, left_just=18):
    """
    Specify format iteration output
    :param iter_obj: Iterable object to be output
    :param row_num: How many are output in a row
    :param left_just: Left aligned width
    """
    for index in range(len(iter_obj)):
        print(iter_obj[index].ljust(left_just), end='')
        if (index+1) % row_num == 0:
            print('\n')
            
verify_codes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

# Traverse the string with a for loop
for char in verify_codes:
    print(char)

print('>>>for Loop through string')
iter_out(verify_codes, row_num=10, left_just=11)

Output results:

>>>for Loop through string
a        b        c        d        e        f        g        h        i        j        
k        l        m        n        o        p        q        r        s        t        
u        v        w        x        y        z        A        B        C        D        
E        F        G        H        I        J        K        L        M        N        
O        P        Q        R        S        T        U        V        W        X        
Y        Z        0        1        2        3        4        5        6        7        
8        9
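When the position of each character is needed as well, the built-in enumerate() can replace the range(len(...)) pattern used in iter_out(). This is only a sketch of that alternative, with an illustrative sample word.

```python
word = 'Python'

pairs = []
for index, char in enumerate(word):   # yields (0, 'P'), (1, 'y'), ...
    pairs.append(f'{index}:{char}')

summary = ' '.join(pairs)
print(summary)
```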

Doesn't the output look much better with the iter_out() function defined above?

Posted by pmjm1 on Mon, 06 Dec 2021 18:49:03 -0800