Exception capture and regular expression usage in Python


Exception capture

1. What is exception capture

"""
Exception capture lets a program continue running when an exception occurs during execution.

Note: you do not need exception capture everywhere. Use it when you know that a certain piece of code may raise an exception and you do not want the program to crash.
"""

2. Syntax of exception capture

"""
Syntax 1: catch all exceptions
try:
    Code snippet 1
except:
    Code snippet 2


Explanation:
try, except - keywords, fixed syntax
: - fixed syntax
Code snippet 1 - one or more statements indented one level under try; the code that may raise an exception (the code whose exceptions need to be caught)
Code snippet 2 - the code to execute once an exception has been caught

Execution process:
Code snippet 1 is executed first. If an exception occurs while it runs, execution jumps straight to code snippet 2; if no exception occurs in code snippet 1, code snippet 2 is not executed.
"""
# Exercise: read an age from input. If the input is invalid, print an error message
# try:
#     age = int(input('Please enter age: '))
#     print('input completed!')
# except:
#     print('wrong age input!')
#
# print('===================')

try:
    # age = int(input('Please enter age: '))
    print({'name': 'zhang San'}['age'])   # raises KeyError
    print([1, 2, 3][10])                  # never reached
except:
    print('exception occurred ')
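The age exercise above can also be wrapped in a small function so that it can be tried without keyboard input. This is only a minimal sketch: the function name parse_age is hypothetical and not part of the original exercise.

```python
def parse_age(text):
    """Return the age as an int, or None if text is not a valid integer."""
    try:
        age = int(text)              # may raise an exception for input such as 'abc'
        print('input completed!')
        return age
    except:
        print('wrong age input!')
        return None


print(parse_age('18'))    # 18
print(parse_age('abc'))   # None
```

Returning None instead of crashing lets the caller decide whether to ask for the age again.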
"""
Syntax 2: catch exceptions of a specified type
try:
    Code snippet 1
except ExceptionType:
    Code snippet 2

Execution process: code snippet 1 is executed first. If an exception occurs in it, its type is compared with the exception type after except. If they match, the program does not crash and code snippet 2 is executed; if they do not match, the program crashes.
"""
# [10, 20][100]   # IndexError
# {'name': 'zhang San'}['age']   # KeyError
try:
    print({'name': 'zhang San'}['age'])   # raises KeyError, which is caught below
    print([10, 20][100])
except KeyError:
    print('exception!')

"""
Syntax 3: catch several exception types at once and handle them all in the same way
try:
    Code snippet 1
except (ExceptionType1, ExceptionType2, ...):
    Code snippet 2


Syntax 4: catch several exception types at once and handle each type differently
try:
    Code snippet 1
except ExceptionType1:
    Code snippet 11
except ExceptionType2:
    Code snippet 22
except ExceptionType3:
    Code snippet 33
...

"""

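Syntax 3 and syntax 4 come without an example, so here is a minimal sketch of both. The function names lookup and describe are made up for illustration.

```python
def lookup(container, key):
    """Syntax 3: one handler shared by several exception types."""
    try:
        return container[key]
    except (KeyError, IndexError):
        return 'not found'


def describe(container, key):
    """Syntax 4: a separate handler for each exception type."""
    try:
        return container[key]
    except KeyError:
        return 'missing key'
    except IndexError:
        return 'index out of range'


print(lookup({'name': 'zhang San'}, 'age'))     # not found
print(lookup([10, 20], 100))                    # not found
print(describe({'name': 'zhang San'}, 'age'))   # missing key
print(describe([10, 20], 100))                  # index out of range
```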
3. The finally keyword

"""
The finally keyword can be added at the end of any of the exception capture structures above:
try:
    Code snippet 1
except:
    Code snippet 2
finally:
    Code snippet 3


Code snippet 3 is always executed, no matter what happens in code snippet 1: no exception, an exception that is caught, or an exception that is not caught.
"""
# try:
#     print([10, 20, 30][100])   # IndexError, which except KeyError does not catch
#     print('=======')
# except KeyError:
#     print('exception caught')
# finally:
#     print('write last words!')   # still runs before the program crashes
# print('Other statements')


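def func1():
    try:
        print('==========')
        return 100
    except:
        print('exception caught')
    finally:
        # Runs even though the try block has already reached return
        print('write last words 2!')


print(func1())   # prints ==========, then the finally message, then 100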
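The following sketch records the order of execution in a list, to show that finally also runs when the exception is not caught by the except next to it. The names run and steps are hypothetical, and the outer try exists only so that the sketch itself does not crash.

```python
def run(index):
    steps = []
    try:
        try:
            [10, 20, 30][index]          # IndexError when index is out of range
            steps.append('no exception')
        except KeyError:                 # does not match IndexError
            steps.append('caught')
        finally:
            steps.append('finally')      # runs in every case
    except IndexError:
        # Outer handler: the exception left the inner structure after finally ran
        steps.append('propagated')
    return steps


print(run(0))     # ['no exception', 'finally']
print(run(100))   # ['finally', 'propagated']
```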

Regular expression

1. What is a regular expression

"""
A regular expression is a tool that makes string processing easier (in essence, it describes a pattern that strings are matched against)
"""

2. Syntax of regular expression

"""
fullmatch(regular expression, string) - matches the whole string against the regular expression. If the match fails, the result is None
Regular expression in JavaScript: /regular expression/
Regular expression in Python: r'regular expression'
"""
from re import fullmatch, search, findall

print('=====================1.Matching symbol=====================')
# 1) Normal character - represents the character itself
re_str = r'abc'
result = fullmatch(re_str, 'abc')
print(result)
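The value printed above is either a match object or None. A minimal sketch of how the result is typically used, assuming fullmatch is the function from the standard re module:

```python
from re import fullmatch

result = fullmatch(r'abc', 'abc')
print(result.group())              # abc - the text that was matched

# The whole string has to match, not just a part of it
print(fullmatch(r'abc', 'abcd'))   # None

# None is falsy and a match object is truthy, so the result works in an if
if fullmatch(r'abc', 'abc'):
    print('matched')
```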

# 2) . - matches any single character (except a newline, by default)
re_str = r'.abc'
result = fullmatch(re_str, '+abc')
print(result)

# Match a string of length 5. The middle three characters of the string are abc. The first and last characters are arbitrary
re_str = r'.abc.'
result = fullmatch(re_str, '(abc+')
print(result)

# Matches an arbitrary string of length 3
re_str = r'...'
result = fullmatch(re_str, 'ajs')
print(result)

# 3) \w - matches any digit, letter or underscore (this holds within ASCII; in Python 3 it also matches other Unicode word characters such as Chinese characters, so it is rarely used)
re_str = r'\wabc'
result = fullmatch(re_str, '8abc')
print(result)

# 4)\d - matches any numeric character
re_str = r'\d\d\d'
result = fullmatch(re_str, '142')
print(result)

re_str = r'\d\dabc\d\d'
result = fullmatch(re_str, '23abc89')
print(result)

# 5)\s - matches any white space character
re_str = r'\s\d..'
result = fullmatch(re_str, '\n9k/')
print(result)

re_str = r'\d\d\s\d\d'
result = fullmatch(re_str, '78 23')
print(result)

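# 6) \D, \S and \W
"""
\D, \S, \W - an uppercase letter after the backslash does the opposite of the corresponding lowercase letter: \D matches any non-digit character, \S any non-whitespace character, \W any character that \w does not match
"""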
re_str = r'\dabc\D'
result = fullmatch(re_str, '8abc-')
print(result)

re_str = r'\Sabc'
result = fullmatch(re_str, '=abc')
print(result)

# 7) [character set] - matches any one character in the character set
"""
Note: one [] matches exactly one character
a.
[abc123]  -  matches any one of the characters a, b, c, 1, 2, 3
b.
[a-z]    -   matches any character from a to z (any lowercase letter)
[A-Z]    -   matches any uppercase letter
[a-zA-Z] -   matches any letter
[0-9]    -   matches any digit character
[\u4e00-\u9fa5]   - matches any Chinese character
"""
re_str = r'[cz+?]123'
result = fullmatch(re_str, '?123')
print(result)

re_str = r'[\u4e00-\u9fa5]123'
result = fullmatch(re_str, '看123')
print(result)

# Exercise: judge whether the input mobile phone number is legal
re_str = r'1[3-9]\d\d\d\d\d\d\d\d\d'
result = fullmatch(re_str, '13598902763')
print(result)

# In [], the code point of the character before - must not be greater than that of the character after -; otherwise an error is raised
# result = fullmatch(r'[a-0]abc', '0abc')

# If - is not between two characters, it has no special function and simply represents itself
result = fullmatch(r'[-09]abc', '-abc')
print(result)

# A letter, digit or underscore followed by abc
re_str = r'[a-zA-Z\d_]abc'
result = fullmatch(re_str, '_abc')
print(result)

# 8) [^character set] - matches any one character that is not in the character set
"""
[^\u4e00-\u9fa5]   - matches any non-Chinese character
[^0-9]  -  matches any non-digit character
[^a-zA-Z]  - matches any non-alphabetic character
"""
print(fullmatch(r'[abc^]123', 'b123'))   # ^ is not first in [], so it is an ordinary character
print(fullmatch(r'[^abc]123', 'a123'))   # None: a is excluded

print('=====================2.Detection symbol======================')
# 1) \b - checks whether the position is a word boundary
"""
Word boundary: the beginning of the string, the end of the string, or any symbol that separates two words
Note: detection symbols do not consume any characters, so they do not affect the length of the match; they only check a position
"""
message = 'how are you?i am fine!thank you!'
re_str = r'\d\d.\b\d\d'
print(fullmatch(re_str, '56=89'))
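The message string defined above is a convenient input for trying \b on real words. A small sketch, assuming search and findall come from the standard re module (search looks for the first match anywhere in the string, findall collects all matches):

```python
from re import search, findall

message = 'how are you?i am fine!thank you!'

# 'you' stands as a whole word twice
print(findall(r'\byou\b', message))   # ['you', 'you']

# 'an' only occurs inside 'thank', so it has no word boundary around it
print(search(r'\ban\b', message))     # None
print(search(r'an', message))         # a match object for the 'an' in 'thank'
```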

# 2) ^ - checks whether the position of ^ is the beginning of the string
re_str = r'\d^abc'
print(fullmatch(re_str, '1abc'))   # None: ^ is not at the beginning

re_str = r'^\d\d\d'
print(fullmatch(re_str, '678'))
print(search(re_str, 'shdj39 Geek time 238 u 282='))   # None: the string does not start with three digits

# 3) $ - checks whether the position of $ is the end of the string
re_str = r'\d\d$'
print(search(re_str, 'Times peak 78 hint method 23 sofa 89'))
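The effect of ^ and $ is easiest to see with search, because search would otherwise accept a match anywhere in the string. A minimal sketch with a made-up sample string:

```python
from re import search

text = 'abc 123'   # hypothetical sample string

print(search(r'\d\d\d', text).group())    # 123 - found somewhere in the string
print(search(r'^\d\d\d', text))           # None - the string does not start with three digits
print(search(r'\d\d\d$', text).group())   # 123 - the string ends with three digits
print(search(r'^abc', text).group())      # abc - the string starts with abc
```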

print('=======================3.Matching times==================')
# 1) * - matches 0 or more times
"""
character* - the character appears 0 or more times
"""
re_str = r'a*'
print(fullmatch(re_str, 'aaa'))

re_str = r'\d*'
print(fullmatch(re_str, '478923'))

re_str = r'123[a-z]*'
print(fullmatch(re_str, '123ukl'))

# 2) + - match 1 or more times (at least 1 time)
re_str = r'a+'
print(fullmatch(re_str, 'a'))

# 3) ? - match 0 or 1 times
re_str = r'\d?abc'
print(fullmatch(re_str, '0abc'))

# Exercise: write a regular expression that matches any integer string
# '23874', '-234', '+2348977'
re_str = r'[-+]?\d+'
print(fullmatch(re_str, '+23874'))
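To check the integer pattern against more than one input, it can be wrapped in a helper. This is a sketch only; the name is_integer is hypothetical.

```python
from re import fullmatch


def is_integer(text):
    """True if text is an optional sign followed by at least one digit."""
    return fullmatch(r'[-+]?\d+', text) is not None


for sample in ['23874', '-234', '+2348977', '12a', '+', '']:
    print(repr(sample), is_integer(sample))
```

The first three samples are accepted; '12a', '+' and the empty string are rejected because \d+ needs at least one digit and fullmatch has to cover the whole string.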

# 4) {}
"""
{N}   -  matches exactly N times
{M,N}  - matches M to N times
{M,}   - matches at least M times
{,N}   - matches at most N times (0 to N times)
"""
re_str = r'\d{4}abc'
print(fullmatch(re_str, '6723abc'))

re_str = r'a{2,5}123'
print(fullmatch(re_str, 'aaaaa123'))

re_str = r'a{2,}123'
print(fullmatch(re_str, 'aaaaaaaaaaa123'))

re_str = r'a{,2}123'
print(fullmatch(re_str, 'aa123'))

# 5) Greedy and non-greedy
"""
When the number of matches is not fixed, there are two matching modes: greedy and non-greedy
a. Greedy (the default): match as many times as possible while the overall match still succeeds
        *, +, ?, {M,N}, {M,}, {,N}

b. Non-greedy: match as few times as possible while the overall match still succeeds. Add a question mark after the quantifier to make it non-greedy
      *?, +?, ??, {M,N}?, {M,}?, {,N}?
"""
re_str = r'\d{2,}'
print(search(re_str, 'Nurse 227382 abc Hello!'))

re_str = r'\d{2,}?'
print(search(re_str, 'Nurse 227382 abc Hello!'))
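The difference between the two modes becomes visible when only the matched text is printed. A minimal sketch; the second sample string (html) is made up for illustration.

```python
from re import search

text = 'Nurse 227382 abc Hello!'
print(search(r'\d{2,}', text).group())    # 227382 - greedy takes every digit it can
print(search(r'\d{2,}?', text).group())   # 22 - non-greedy stops at the minimum of two

html = '<b>one</b><b>two</b>'   # hypothetical sample
print(search(r'<b>.+</b>', html).group())    # <b>one</b><b>two</b>
print(search(r'<b>.+?</b>', html).group())   # <b>one</b>
```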

# Exercise: get the names of all countries in the epidemic data
with open('data.json', encoding='utf-8') as f:   # assumes data.json is UTF-8 text
    content = f.read()
re_str = r'"provinceName":"(.+?)",'
print(findall(re_str, content))
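Since data.json is not included here, the same pattern can be tried on an inline string. The sample below is invented and only imitates the shape the pattern expects ("provinceName":"...",); it is not the real data.

```python
from re import findall

# Hypothetical sample in the shape the exercise pattern expects
content = '[{"provinceName":"France","count":1},{"provinceName":"Japan","count":2}]'

names = findall(r'"provinceName":"(.+?)",', content)
print(names)   # ['France', 'Japan']

# With a greedy .+ the group runs on to the last ", in the text
greedy = findall(r'"provinceName":"(.+)",', content)
print(len(greedy))   # 1 - a single, over-long result
```

This is why the exercise uses the non-greedy .+? inside the group.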

Branching and grouping

1. Branch: |

# regex 1|regex 2|regex 3 - the string matches if it matches any one of the alternatives
# Write a regular expression that matches either of the strings 123abc and 456abc
re_str = r'123abc|456abc'
print(fullmatch(re_str, '456abc'))

# | separates whole alternatives: this pattern means 123 or 345abc, not (123|345)abc
re_str = r'123|345abc'
print(fullmatch(re_str, '123abc'))   # None
print(fullmatch(re_str, '123'))
print(fullmatch(re_str, '345abc'))

2. Grouping: ()

"""
1) Treat part of a regular expression as a whole
"""
# abc repeated three times
re_str = r'(abc){3}'
print(fullmatch(re_str, 'abcabcabc'))

# Write a regular expression that matches either of the strings 123abc and 456abc
re_str = r'(123|456)abc'
print(fullmatch(re_str, '456abc'))

# A structure of two digits followed by two letters, repeated four times: 34hj56kl67uj23Bm
re_str = r'(\d\d[a-zA-Z]{2}){4}'
print(fullmatch(re_str, '34hj56kl67uj23Bm'))

"""
2) Repeat:
\M  - repeats the content matched by the M-th group before it (M starts from 1)
"""
re_str = r'(\d\d)=\1abc'
print(fullmatch(re_str, '67=67abc'))

re_str = r'(\d\d)-([a-z]{3})-\2-\1'
print(fullmatch(re_str, '23-bnm-bnm-23'))
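A backreference repeats the text that the group actually matched, not the group's pattern. The sketch below contrasts the two and then uses \1 to find a doubled word; the sample sentence is made up.

```python
from re import fullmatch, search

# \1 must be the same two digits that group 1 matched
print(fullmatch(r'(\d\d)=\1', '67=67'))     # a match object
print(fullmatch(r'(\d\d)=\1', '67=68'))     # None
print(fullmatch(r'(\d\d)=\d\d', '67=68'))   # a match object: \d\d accepts any two digits

# A word followed by a space and the same word again
print(search(r'\b([a-z]+) \1\b', 'this is is a test').group())   # is is
```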

"""
3) Capture
When the regular expression contains groups, findall returns only the content matched by the groups
"""

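A minimal sketch of how findall behaves with zero, one and two groups in the pattern. The sample string is made up for illustration.

```python
from re import findall

text = 'tel: 23-bnm, 45-xyz'   # hypothetical sample string

# No group: the whole match is returned
print(findall(r'\d\d-[a-z]{3}', text))       # ['23-bnm', '45-xyz']

# One group: only the text matched by the group is returned
print(findall(r'\d\d-([a-z]{3})', text))     # ['bnm', 'xyz']

# Several groups: one tuple per match
print(findall(r'(\d\d)-([a-z]{3})', text))   # [('23', 'bnm'), ('45', 'xyz')]
```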
3. Escape character

re_str = r'\.\d\d'
print(fullmatch(re_str, '.23'))

re_str = r'abc\+\d\d'
print(fullmatch(re_str, 'abc+34'))

# Note: symbols that have a special function on their own (such as + . * ?) lose that function inside []
re_str = r'[-+.]abc'
print(fullmatch(re_str, '.abc'))
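The effect of escaping is clearest when the escaped and unescaped patterns are run against the same strings. A small sketch, assuming fullmatch from the standard re module:

```python
from re import fullmatch

print(fullmatch(r'.\d\d', 'a23'))     # a match object: unescaped . matches any character
print(fullmatch(r'\.\d\d', 'a23'))    # None: \. matches only a literal dot
print(fullmatch(r'\.\d\d', '.23'))    # a match object
print(fullmatch(r'[.]\d\d', '.23'))   # a match object: inside [] the dot is literal
print(fullmatch(r'[.]\d\d', 'a23'))   # None
```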

Posted by rsmarsha on Mon, 29 Jun 2020 18:35:21 -0700