Remove specific characters from strings in Python

Keywords: Python ascii

I'm trying to use Python to remove specific characters from a string. This is the code I'm using right now. Unfortunately, it doesn't seem to have any effect on strings.

for char in line:
    if char in " ?.!/;:":
        line.replace(char,'')

How do I do this correctly?

Answer #1

For the inverse requirement, where only certain characters are allowed in a string, you can use a regular expression with the set complement operator, as in [^ABCabc]. For example, to remove everything except ASCII letters, digits, and the hyphen:

>>> import string
>>> import re
>>>
>>> phrase = '  There were "nine" (9) chick-peas in my pocket!!!      '
>>> # string.ascii_letters works in Python 2 and 3 (string.letters is Python 2 only)
>>> allow = string.ascii_letters + string.digits + '-'
>>> re.sub('[^%s]' % allow, '', phrase)
'Therewerenine9chick-peasinmypocket'

From the Python regular expression documentation:

Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5', and [^^] will match any character except '^'. ^ has no special meaning if it is not the first character in the set.
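
The cases described in the quoted documentation can be tried directly. This is a small sketch; the sample strings are made up for illustration and are not from the original answer.

```python
import re

# '^' as the first character complements the set:
# every character that is NOT '5' is removed.
only_fives = re.sub(r'[^5]', '', 'a5b55c')

# '[^^]' matches any character other than '^', so only carets survive.
only_carets = re.sub(r'[^^]', '', 'x^y^^z')

# A '^' that is not first in the set is an ordinary character:
# this set contains 'a' and '^', and both are removed.
no_a_or_caret = re.sub(r'[a^]', '', 'a^b^c')

print(only_fives)     # 555
print(only_carets)    # ^^^
print(no_a_or_caret)  # bc
```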

Answer #2

#!/usr/bin/python
import re

strs = "how^ much for{} the maple syrup? $20.99? That's[] ricidulous!!!"
print(strs)
# Replace the listed characters with a space; any character can be added to
# the set. Inside [...] no '|' separator is needed (a '|' there is literal).
nstr = re.sub(r'[?$.!ab]', r' ', strs)
print(nstr)
# Remove every remaining character that is not a letter, digit, or space.
nestr = re.sub(r'[^a-zA-Z0-9 ]', r'', nstr)
print(nestr)
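
When the characters to remove come from a variable rather than a hand-written pattern, some of them (such as ']', '^', or '-') have a special meaning inside a character class. One possible approach, sketched here, is to escape each character with re.escape before building the set. The helper name remove_chars and the sample text are hypothetical and not part of the original answer.

```python
import re

def remove_chars(text, unwanted):
    # Hypothetical helper: build a character class from the characters in
    # `unwanted`, escaping each one so that it is always matched literally.
    if not unwanted:
        return text  # an empty class '[]' would not be a valid pattern
    pattern = '[' + ''.join(re.escape(c) for c in unwanted) + ']'
    return re.sub(pattern, '', text)

cleaned = remove_chars("how^ much? $20.99 [really]!", "?$.![]^")
print(cleaned)  # how much 2099 really
```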

Answer #3

How's this?

def text_cleanup(text):
    new = ""
    for i in text:
        if i not in " ?.!/;:":
            new += i
    return new
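
For context on why the loop in the question appears to do nothing: Python strings are immutable, and str.replace returns a new string instead of changing line, so the result has to be assigned back. The sketch below shows that fix, plus one alternative for Python 3 that uses str.translate with a deletion table built by str.maketrans. The sample value of line is made up for illustration.

```python
line = "Hello, world! How are you?; fine: thanks/ok."

# str.replace returns a new string; assign the result back to keep it.
fixed = line
for char in " ?.!/;:":
    fixed = fixed.replace(char, '')

# Alternative (Python 3): the third argument of str.maketrans lists the
# characters to delete, and translate applies the table in one call.
table = str.maketrans('', '', " ?.!/;:")
translated = line.translate(table)

print(fixed)       # Hello,worldHowareyoufinethanksok
print(translated)  # Hello,worldHowareyoufinethanksok
```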

Answer #4

Here's an approach that doesn't use regular expressions:

ipstring = "text with symbols!@#$^&*( ends here"
opstring = ''
for i in ipstring:
    # Keep letters, digits, and spaces; drop everything else.
    if i.isalnum() or i == ' ':
        opstring += i
print(opstring)
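
The same filter can be written more compactly with str.join and a generator expression, which builds the result in one step instead of concatenating character by character. This is only a sketch of an equivalent form and is not part of the original answer.

```python
ipstring = "text with symbols!@#$^&*( ends here"

# Keep alphanumeric characters and spaces; join assembles the result once.
opstring = ''.join(ch for ch in ipstring if ch.isalnum() or ch == ' ')
print(opstring)  # text with symbols ends here
```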

Answer #5

You can also use a function that substitutes several kinds of regular expressions or other patterns, supplied as a list. This lets you mix regular expressions, character classes, and plain text patterns. It is especially useful when you need to remove many elements, such as HTML tags.

*Note: the code below is for Python 3.x

import re  # Regular expression library


def string_cleanup(x, notwanted):
    for item in notwanted:
        x = re.sub(item, '', x)
    return x

line = "<title>My example: <strong>A text %very% $clean!!</strong></title>"
print("Uncleaned: ", line)

# Get rid of html elements
html_elements = ["<title>", "</title>", "<strong>", "</strong>"]
line = string_cleanup(line, html_elements)
print("1st clean: ", line)

# Get rid of special characters
special_chars = ["[!@#$]", "%"]
line = string_cleanup(line, special_chars)
print("2nd clean: ", line)

The function string_cleanup takes your string x and your list notwanted as arguments. For each element or pattern in that list, the substitution is performed wherever it matches.

Output:

Uncleaned:  <title>My example: <strong>A text %very% $clean!!</strong></title>
1st clean:  My example: A text %very% $clean!!
2nd clean:  My example: A text very clean
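
If the order of the substitutions does not matter, one possible variant is to combine all patterns into a single alternation so the string is scanned once. This is a sketch under that assumption; the name string_cleanup_once is hypothetical and not from the original answer.

```python
import re

def string_cleanup_once(x, notwanted):
    # Hypothetical variant: wrap each pattern in a non-capturing group and
    # join the groups with '|' so one compiled regex handles all of them.
    combined = re.compile('|'.join('(?:%s)' % item for item in notwanted))
    return combined.sub('', x)

line = "<title>My example: <strong>A text %very% $clean!!</strong></title>"
patterns = ["<title>", "</title>", "<strong>", "</strong>", "[!@#$]", "%"]
result = string_cleanup_once(line, patterns)
print(result)  # My example: A text very clean
```

Unlike the loop version, a single pass does not rescan text that only becomes a match after an earlier removal, so the two can differ on inputs where one removal creates a new match.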

Posted by SeanStar on Fri, 17 Jan 2020 07:25:44 -0800

Programmer Group