Common Methods for Parsing Python Strings

Keywords: Linux Python SELinux Database

These are some of the common string methods I summarized while learning Python.
This post quotes the built-in help text of Python 3.5, adds a short explanation of each method, and includes several small experiments.

isalnum

S.isalnum() -> bool  # Returns True when all characters in the string are letters or digits, otherwise False.

Return True if all characters in S are alphanumeric and there is
at least one character in S, False otherwise.

>>> str1="hello world"
>>> str2="hello555 world"
>>> str3="66666"
>>> str4="hello"
>>> str1.isalnum()
False
>>> str2.isalnum()
False
>>> str3.isalnum()
True
>>> str4.isalnum()
True

isalpha

S.isalpha() -> bool  # Returns True when all characters in the string are letters, otherwise False.

Return True if all characters in S are alphabetic and there is
at least one character in S, False otherwise.

>>> str1="hello world"
>>> str2="hello555 world"
>>> str3="66666"
>>> str4="hello"
>>> str1.isalpha()
False
>>> str2.isalpha()
False
>>> str3.isalpha()
False
>>> str4.isalpha()
True

isdigit

S.isdigit() -> bool  # Returns True when all characters in the string are digits, otherwise False.

Return True if all characters in S are digits and there is at
least one character in S, False otherwise.

>>> str1="hello world"
>>> str2="hello555 world"
>>> str3="66666"
>>> str1.isdigit()
False
>>> str2.isdigit()
False
>>> str3.isdigit()
True

islower

S.islower() -> bool  # Returns True when all cased characters in the string are lowercase, otherwise False.

Return True if all cased characters in S are lowercase and there is
at least one cased character in S, False otherwise.

>>> str1="hello world"
>>> str2="66666"
>>> str3="HELLO WORLD"
>>> str1.islower()
True
>>> str2.islower()
False
>>> str3.islower()
False

istitle

S.istitle() -> bool  # Returns True when the first letter of each word is uppercase and the remaining letters are lowercase, otherwise False.

Return True if S is a titlecased string and there is at least one
character in S, i.e. upper- and titlecase characters may only
follow uncased characters and lowercase characters only cased ones.
Return False otherwise.

>>> str1="hello world"
>>> str2="Hello World"
>>> str3="HELLO WORLD"
>>> str1.istitle()
False
>>> str2.istitle()
True
>>> str3.istitle()
False

isupper

S.isupper() -> bool  # Returns True when all cased characters in the string are uppercase, otherwise False.

Return True if all cased characters in S are uppercase and there is
at least one cased character in S, False otherwise.

>>> str1="hello world"
>>> str2="66666"
>>> str3="HELLO WORLD"
>>> str1.isupper()
False
>>> str2.isupper()
False
>>> str3.isupper()
True

lower

S.lower() -> str  # Converts all characters in the string to lowercase.

Return a copy of the string S converted to lowercase.

>>> str3="HELLO WORLD"
>>> str3.lower()
'hello world'

upper

S.upper() -> str  # Converts all characters in the string to uppercase.

Return a copy of S converted to uppercase.

>>> str1="hello world"
>>> str1.upper()
'HELLO WORLD'

strip

S.strip([chars]) -> str  # Removes whitespace from both ends of the string.

Return a copy of the string S with leading and trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.

>>> str1="     hello world      "
>>> str2="hello world       "
>>> str1.strip()
'hello world'
>>> str2.strip()
'hello world'
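
The experiment above only covers the default whitespace case. As a small sketch of the optional chars argument mentioned in the help text (the sample strings are made up for illustration), note that chars is treated as a set of characters to remove from both ends, not as a literal prefix or suffix:

```python
# chars is a set of characters, stripped from both ends until a
# character outside the set is reached.
padded = "xxhello worldxx"
stripped_x = padded.strip("x")        # 'hello world'

# Every leading/trailing character found in "w.com" is removed,
# regardless of the order in which it appears.
url = "www.example.com"
stripped_set = url.strip("w.com")     # 'example'

print(stripped_x)
print(stripped_set)
```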

isspace

S.isspace() -> bool  # Returns True when all characters in the string are whitespace, otherwise False.

Return True if all characters in S are whitespace
and there is at least one character in S, False otherwise.

>>> str1="             "
>>> str2="hello world"
>>> str1.isspace()
True
>>> str2.isspace()
False

replace

S.replace(old, new[, count]) -> str  # Replaces the specified substring in the string with a new substring.

Return a copy of S with all occurrences of substring
old replaced by new. If the optional argument count is
given, only the first count occurrences are replaced.

>>> str1="hello world"
>>> str1.replace("l","L")
'heLLo worLd'
>>> str2="abababababababab"
>>> str2.replace("a","c")
'cbcbcbcbcbcbcbcb'
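
The help text also mentions the optional count argument, which the experiments above do not use. A minimal sketch, with a made-up sample string:

```python
text = "abababab"

# Only the first two occurrences of "a" are replaced.
first_two = text.replace("a", "c", 2)   # 'cbcbabab'

# replace() returns a new string; the original is left unchanged.
print(first_two)
print(text)
```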

index

S.index(sub[, start[, end]]) -> int  # Returns the index of the substring in the string; raises an exception if it is not found.

Like S.find() but raise ValueError when the substring is not found.

>>> str1="abcdefg"
>>> str1.index("a")
0
>>> str1.index("f")
5
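
Neither experiment above triggers the ValueError that the help text describes. The following sketch (the variable names are my own) shows one way the failure case might be handled, next to find() for comparison:

```python
text = "abcdefg"

# index() raises ValueError when the substring is missing.
try:
    position = text.index("z")
except ValueError:
    position = None

# find() signals the same situation by returning -1 instead.
found = text.find("z")

print(position)
print(found)
```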

find

S.find(sub[, start[, end]]) -> int  # Searches for the specified substring; returns its index if found, or -1 if not found.

Return the lowest index in S where substring sub is found,
such that sub is contained within S[start:end]. Optional
arguments start and end are interpreted as in slice notation.
Return -1 on failure.

>>> str1="abcdefg"
>>> str2="ababab"
>>> str1.find("bc")
1
>>> str2.find("b")
1
>>> str1.find("f")
5

split

S.split(sep=None, maxsplit=-1) -> list of strings  # Splits the string at the specified separator.

Return a list of the words in S, using sep as the
delimiter string. If maxsplit is given, at most maxsplit
splits are done. If sep is not specified or is None, any
whitespace string is a separator and empty strings are
removed from the result.

>>> str1="/etc/sysconfig/selinux"
>>> str1.split("/")
['', 'etc', 'sysconfig', 'selinux']
>>> str2="abc|mnt|xyz"
>>> str2.split("|")
['abc', 'mnt', 'xyz']
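
The help text describes two behaviours that the experiments above do not show: splitting on whitespace when sep is omitted, and limiting the number of splits with maxsplit. A short sketch with made-up strings:

```python
# With no separator, runs of whitespace are treated as one separator
# and empty strings are dropped from the result.
words = "  hello   world  python ".split()

# With an explicit separator, empty fields are kept.
empty_fields = "a,,b".split(",")

# maxsplit=2 performs at most two splits, leaving the rest intact.
limited = "a,b,c,d".split(",", 2)

print(words)         # ['hello', 'world', 'python']
print(empty_fields)  # ['a', '', 'b']
print(limited)       # ['a', 'b', 'c,d']
```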

startswith

S.startswith(prefix[, start[, end]]) -> bool  # Returns True when the string starts with the specified prefix, otherwise False.

Return True if S starts with the specified prefix, False otherwise.
With optional start, test S beginning at that position.
With optional end, stop comparing S at that position.
prefix can also be a tuple of strings to try.

>>> str1="hello world"
>>> str2="abcdefg"
>>> str1.startswith("hello")
True
>>> str2.startswith("abc")
True

endswith

S.endswith(suffix[, start[, end]]) -> bool  # Returns True when the string ends with the specified suffix, otherwise False.

Return True if S ends with the specified suffix, False otherwise.
With optional start, test S beginning at that position.
With optional end, stop comparing S at that position.
suffix can also be a tuple of strings to try.

>>> str1="hello world"
>>> str2="abcdefg"
>>> str1.endswith("ld")
True
>>> str2.endswith("fg")
True
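
The help text for both startswith and endswith notes that the argument can be a tuple of strings, and that start can shift where the test begins. A small sketch of both, using hypothetical file names:

```python
filenames = ["notes.txt", "photo.png", "script.py", "data.csv"]

# A tuple of suffixes matches if any one of them matches.
text_like = [name for name in filenames if name.endswith((".txt", ".csv"))]

# With start=6 the prefix test begins at index 6 of the string.
starts_at_six = "hello world".startswith("world", 6)

print(text_like)       # ['notes.txt', 'data.csv']
print(starts_at_six)   # True
```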

lstrip

S.lstrip([chars]) -> str  # Removes whitespace only from the left side of the string.

Return a copy of the string S with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.

>>> str1="           hello world         "
>>> str2="            hello world"
>>> str3="hello world            "
>>> str1.lstrip()
'hello world         '
>>> str2.lstrip()
'hello world'
>>> str3.lstrip()
'hello world            '

rstrip

S.rstrip([chars]) -> str  # Removes whitespace only from the right side of the string.

Return a copy of the string S with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.

>>> str1="           hello world         "
>>> str2="            hello world"
>>> str3="hello world            "
>>> str1.rstrip()
'           hello world'
>>> str2.rstrip()
'            hello world'
>>> str3.rstrip()
'hello world'

rfind

S.rfind(sub[, start[, end]]) -> int  # Returns the highest index of the substring in the string, or -1 if it is not found.

Return the highest index in S where substring sub is found,
such that sub is contained within S[start:end]. Optional
arguments start and end are interpreted as in slice notation.

>>> str1="hello world"
>>> str2="abcdefg"
>>> str1.rfind("world")
6
>>> str1.rfind("r")
8
>>> str2.rfind("e")
4
>>> str2.rfind("g")
6

format

S.format(*args, **kwargs) -> str  # Formats the string by substituting the given values.

Return a formatted version of S, using substitutions from args and kwargs.
The substitutions are identified by braces ('{' and '}').

>>> print("{name}======>{age}".format(name="tom",age=22))
tom======>22
>>> print("{name}======>{age}".format(age=22,name="tom"))
tom======>22
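
The experiments above use keyword arguments only. As a supplementary sketch (the values are made up), format() also accepts positional arguments and a format specification after a colon:

```python
# Positional arguments are referenced by index.
positional = "{0} is {1} years old".format("tom", 22)

# Indexes can appear in any order.
reordered = "{1}, {0}".format("a", "b")

# ">8.2f": right-align in a field of width 8, two decimal places.
padded = "{:>8.2f}".format(3.14159)

print(positional)    # tom is 22 years old
print(reordered)     # b, a
print(repr(padded))  # '    3.14'
```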

swapcase

S.swapcase() -> str  # Converts lowercase letters to uppercase and uppercase letters to lowercase.

Return a copy of S with uppercase characters converted to lowercase
and vice versa.

>>> str1="HELLO world"
>>> str1.swapcase()
'hello WORLD'
>>> str2="hello WORLD"
>>> str2.swapcase()
'HELLO world'

title

S.title() -> str  # Capitalizes the first letter of every word in the string.

Return a titlecased version of S, i.e. words start with title case
characters, all remaining cased characters have lower case.

>>> str1="hello world"
>>> str1.title()
'Hello World'
>>> str2="this is a test string"
>>> str2.title()
'This Is A Test String'
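
One detail worth knowing, shown here as a small sketch with made-up strings: title() treats any run of letters as a word, so a letter that follows an apostrophe is capitalized too, and letters that were already uppercase inside a word are lowered.

```python
# The apostrophe ends one "word", so the "r" after it is capitalized.
contraction = "they're here".title()

# Cased characters after the first letter of each word become lowercase.
mixed = "hELLO wORLD".title()

print(contraction)  # They'Re Here
print(mixed)        # Hello World
```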

join

S.join(iterable) -> str  # Joins the strings in an iterable, inserting S between them as the separator.

Return a string which is the concatenation of the strings in the
iterable. The separator between elements is S.

>>> str1="abcd"
>>> str2="xyz"
>>> str1.join(str2)
'xabcdyabcdz'
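
In the experiment above the iterable is itself a string, so the separator is inserted between its characters. A more typical use, sketched here with made-up data, joins a list of strings with a short separator:

```python
# Join list items with a comma.
csv_line = ",".join(["a", "b", "c"])

# join() is the inverse of split() for the same separator.
path = "/".join(["", "etc", "sysconfig", "selinux"])

# Items must be strings, so other types are converted first.
numbers = ", ".join(str(n) for n in [1, 2, 3])

print(csv_line)  # a,b,c
print(path)      # /etc/sysconfig/selinux
print(numbers)   # 1, 2, 3
```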

capitalize

S.capitalize() -> str  # Capitalizes the first letter of the string.

Return a capitalized version of S, i.e. make the first character
have upper case and the rest lower case.

>>> str1="hello world"
>>> str1.capitalize()
'Hello world'
>>> str1="linux"
>>> str1.capitalize()
'Linux'

center

S.center(width[, fillchar]) -> str  # Centers the string in the specified width; if the string is shorter, both sides are padded with the second parameter, which defaults to a space.

Return S centered in a string of length width. Padding is
done using the specified fill character (default is a space)

>>> str1="hello world"
>>> str1.center(30,"#")
'#########hello world##########'
>>> str2="linux"
>>> str2.center(16,"*")
'*****linux******'

count

S.count(sub[, start[, end]]) -> int  # Counts the occurrences of the substring in the string.

Return the number of non-overlapping occurrences of substring sub in
string S[start:end]. Optional arguments start and end are
interpreted as in slice notation.

>>> str1="hello world"
>>> str1.count("l")
3
>>> str2="aaaaaaaaaaaa"
>>> str2.count("a")
12
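
The help text says that occurrences are counted without overlap, and that start and end restrict the search. A minimal sketch of both points, with made-up strings:

```python
# "aa" fits twice into "aaaa" without overlapping (indexes 0-1 and 2-3).
non_overlapping = "aaaa".count("aa")

# Counting from index 5 skips the "o" in "hello" and finds only the one in "world".
from_index_five = "hello world".count("o", 5)

print(non_overlapping)   # 2
print(from_index_five)   # 1
```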

ljust

S.ljust(width[, fillchar]) -> str  # Left-justifies the string in the specified width; if the string is shorter, it is padded on the right with the second parameter, which defaults to a space.

Return S left-justified in a Unicode string of length width. Padding is
done using the specified fill character (default is a space).

>>> str1="hello world"
>>> str1.ljust(20,"@")
'hello world@@@@@@@@@'
>>> str2="linux"
>>> str2.ljust(18,"&")
'linux&&&&&&&&&&&&&'

rjust

S.rjust(width[, fillchar]) -> str  # Right-justifies the string in the specified width; if the string is shorter, it is padded on the left with the second parameter, which defaults to a space.

Return S right-justified in a string of length width. Padding is
done using the specified fill character (default is a space).

>>> str1="hello world"
>>> str2="linux"
>>> str1.rjust(30,"#")
'###################hello world'
>>> str2.rjust(14,"$")
'$$$$$$$$$linux'

rsplit

S.rsplit(sep=None, maxsplit=-1) -> list of strings  # Splits the string at the specified separator, starting from the end of the string.

Return a list of the words in S, using sep as the
delimiter string, starting at the end of the string and
working to the front. If maxsplit is given, at most maxsplit
splits are done. If sep is not specified, any whitespace string
is a separator.

>>> str1="hello/world/people"
>>> str1.rsplit("/")
['hello', 'world', 'people']
>>> str2="/etc/sysconfig/selinux"
>>> str2.rsplit("/")
['', 'etc', 'sysconfig', 'selinux'] 
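
Without maxsplit, rsplit() gives the same result as split(), as the experiments above show. The difference only appears when the number of splits is limited. A small sketch:

```python
path = "/etc/sysconfig/selinux"

# One split from the right separates the last component from the rest.
from_right = path.rsplit("/", 1)

# One split from the left separates at the first "/" instead.
from_left = path.split("/", 1)

print(from_right)  # ['/etc/sysconfig', 'selinux']
print(from_left)   # ['', 'etc/sysconfig/selinux']
```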

expandtabs

S.expandtabs(tabsize=8) -> str  # Expands each tab character into spaces up to the next tab stop; tab stops default to every 8 columns.

Return a copy of S where all tab characters are expanded using spaces.
If tabsize is not given, a tab size of 8 characters is assumed.

>>> str1="hello\tworld"
>>> print(str1)
hello   world
>>> str1.expandtabs()
'hello   world'
>>> str1.expandtabs(8)
'hello   world'
>>> str1.expandtabs(2)
'hello world'
>>> str1.expandtabs(12)
'hello       world'

rindex

S.rindex(sub[, start[, end]]) -> int  # Returns the highest index of the substring in the string; raises an exception if it is not found.

Like S.rfind() but raise ValueError when the substring is not found.

>>> str1="abcdefghijklmn"
>>> str1.rindex("abc")
0
>>> str1.rindex("efg")
4
>>> str1.rindex("g")
6
>>> str1.rindex("m")
12  

isprintable

S.isprintable() -> bool  # Returns True when all characters in the string are printable, otherwise False.

Return True if all characters in S are considered
printable in repr() or S is empty, False otherwise.

>>> str1="              "
>>> str2="hello world"
>>> str1.isprintable()
True
>>> str2.isprintable()
True
>>> str3="\t"
>>> print(str3)
>>> str3.isprintable()
False   

splitlines

S.splitlines([keepends]) -> list of strings  # Splits the string at line breaks and returns the resulting list.

Return a list of the lines in S, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends
is given and true.

>>> str1="hello \e world"
>>> print(str1)
hello \e world
>>> str1.splitlines()
['hello \\e world']
>>> type(str1.splitlines())
<class 'list'>
>>> str2="hello world"
>>> str2.splitlines()
['hello world']
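
Neither string in the experiments above contains a line break, so each result is a one-element list. The following sketch, with a made-up multi-line string, shows the actual splitting, the keepends argument, and how the result differs from split("\n"):

```python
text = "line one\nline two\r\nline three\n"

# Both "\n" and "\r\n" count as line boundaries and are removed.
lines = text.splitlines()

# With keepends=True the line breaks stay attached to each line.
lines_with_ends = text.splitlines(True)

# split("\n") leaves the "\r" in place and adds a trailing empty string.
naive = text.split("\n")

print(lines)            # ['line one', 'line two', 'line three']
print(lines_with_ends)  # ['line one\n', 'line two\r\n', 'line three\n']
print(naive)            # ['line one', 'line two\r', 'line three', '']
```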

format_map

S.format_map(mapping) -> str  # Formats the string using values taken from a mapping.

Return a formatted version of S, using substitutions from mapping.
The substitutions are identified by braces ('{' and '}').

>>> str1="hello world {lang}"
>>> lang="python"
>>> print(str1.format_map(vars()))
hello world python
>>> str2="{os} {database} {webserver} python"
>>> os="linux"
>>> database="mysql"
>>> webserver="apache"
>>> print(str2.format_map(vars()))
linux mysql apache python
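
Because format_map() uses the mapping directly instead of copying it into a dict, a dict subclass can supply a fallback for missing keys. This is only a sketch; the class name DefaultMapping and the placeholder style are my own choices:

```python
class DefaultMapping(dict):
    """Hypothetical helper: returns a visible placeholder for missing keys."""

    def __missing__(self, key):
        return "<" + key + ">"


template = "{os} {database} python"

# "database" is not supplied, so __missing__ provides the placeholder.
result = template.format_map(DefaultMapping(os="linux"))

print(result)  # linux <database> python
```

With a plain dict, the same call would raise KeyError for the missing key.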

isnumeric

S.isnumeric() -> bool  # Returns True when all characters in the string are numeric, otherwise False.

Return True if there are only numeric characters in S,
False otherwise.

>>> str1="123456"
>>> str2="123.456"
>>> str3="hello world"
>>> str1.isnumeric()
True
>>> str2.isnumeric()
False
>>> str3.isnumeric()
False
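
For plain ASCII digits, isnumeric() and isdigit() give the same answer, so the experiments here and in the isdigit section look identical. The two differ for some Unicode characters. A small sketch, which also uses isdecimal(), a related str method not covered in this post:

```python
superscript_two = "\u00b2"   # the character ²
one_half = "\u00bd"          # the character ½

# isdecimal() is the strictest, isnumeric() the most permissive.
results = {
    "123": ("123".isdecimal(), "123".isdigit(), "123".isnumeric()),
    "superscript two": (superscript_two.isdecimal(),
                        superscript_two.isdigit(),
                        superscript_two.isnumeric()),
    "one half": (one_half.isdecimal(),
                 one_half.isdigit(),
                 one_half.isnumeric()),
}

for name, flags in results.items():
    print(name, flags)

# A sign is not a digit, so none of these methods accept "-1".
negative_is_digit = "-1".isdigit()
print(negative_is_digit)
```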

partition

S.partition(sep) -> (head, sep, tail)  # Splits the string at the given separator and returns a 3-tuple. If the separator occurs more than once, the first occurrence is used.

Search for the separator sep in S, and return the part before it,
the separator itself, and the part after it. If the separator is not
found, return S and two empty strings.

>>> str1="hello world"
>>> str1.partition(" ")
('hello', ' ', 'world')
>>> str1.partition("l")
('he', 'l', 'lo world')
>>> str1.partition("w")
('hello ', 'w', 'orld')

rpartition

S.rpartition(sep) -> (head, sep, tail)  # Splits the string at the last occurrence of the given separator and returns a 3-tuple.

Search for the separator sep in S, starting at the end of S, and return
the part before it, the separator itself, and the part after it. If the
separator is not found, return two empty strings and S.

>>> str1="abcdefg"
>>> str1.rpartition("b")
('a', 'b', 'cdefg')
>>> str1.rpartition("d")
('abc', 'd', 'efg')
>>> str1.rpartition("g")
('abcdef', 'g', '')
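
Each separator in the experiments above occurs exactly once, so partition() and rpartition() would return the same tuples. The sketch below, with made-up strings, shows where they differ: a repeated separator and a missing separator.

```python
setting = "a=b=c"

# partition() splits at the first "=", rpartition() at the last.
first = setting.partition("=")
last = setting.rpartition("=")

# When the separator is missing, the whole string ends up at
# opposite ends of the tuple.
missing_left = "novalue".partition("=")
missing_right = "novalue".rpartition("=")

print(first)          # ('a', '=', 'b=c')
print(last)           # ('a=b', '=', 'c')
print(missing_left)   # ('novalue', '', '')
print(missing_right)  # ('', '', 'novalue')
```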

Posted by timj on Thu, 31 Jan 2019 10:57:16 -0800