I'm trying to use Python to remove specific characters from a string. This is the code I'm using right now. Unfortunately, it doesn't seem to have any effect on strings.
for char in line:
    if char in " ?.!/;:":
        line.replace(char,'')
How do I do this correctly?
Answer #1
For the opposite requirement, where only certain characters are allowed in a string, you can use a regular expression with the set complement operator, [^ABCabc]. For example, to delete everything except ASCII letters, digits, and hyphens:
>>> import string
>>> import re
>>>
>>> phrase = ' There were "nine" (9) chick-peas in my pocket!!! '
>>> allow = string.ascii_letters + string.digits + '-'
>>> re.sub('[^%s]' % allow, '', phrase)
'Therewerenine9chick-peasinmypocket'
From the Python regular expression documentation:
Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5', and [^^] will match any character except '^'. ^ has no special meaning if it's not the first character in the set.
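The behavior described in that quote can be checked by running the example sets through re.sub. This is only an illustrative sketch, and the sample strings are made up:

```python
import re

# [^5] matches every character except '5', so replacing those
# matches with '' keeps only the fives.
only_fives = re.sub(r'[^5]', '', 'a5b55c')

# [^^] matches every character except '^'.
only_carets = re.sub(r'[^^]', '', 'x^y^^z')

# A '^' that is not first in the set is an ordinary character:
# [5^] matches either '5' or '^'.
no_fives_or_carets = re.sub(r'[5^]', '', '5^a^5b')

print(only_fives)          # 555
print(only_carets)         # ^^^
print(no_fives_or_carets)  # ab
```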
Answer #2
#!/usr/bin/python
import re

strs = "how^ much for{} the maple syrup? $20.99? That's[] ridiculous!!!"
print(strs)

# Replace a chosen set of characters with a space; any character can be added to the set.
# Inside [...] no '|' separators are needed (a '|' there would be matched literally).
nstr = re.sub(r'[?$.!ab]', ' ', strs)
print(nstr)

# Remove every remaining character that is not a letter, digit, or space.
nestr = re.sub(r'[^a-zA-Z0-9 ]', '', nstr)
print(nestr)
Answer #3
How's this?
def text_cleanup(text):
    new = ""
    for i in text:
        if i not in " ?.!/;:":
            new += i
    return new
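For comparison, the loop from the question can also be made to work with a small change. str.replace returns a new string and leaves the original untouched, so the result has to be assigned back. A minimal sketch (the name remove_chars and the sample text are only for illustration):

```python
def remove_chars(line, unwanted=" ?.!/;:"):
    # Strings are immutable: replace() returns a new string,
    # so the result must be assigned back to the variable.
    for char in unwanted:
        line = line.replace(char, '')
    return line

print(remove_chars("Hello, world! How are you?"))  # Hello,worldHowareyou
```

Looping over the unwanted characters rather than over the line also avoids changing the string while iterating over it.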
Answer #4
Here's an approach that doesn't use regular expressions:
ipstring = "text with symbols!@#$^&*( ends here"
opstring = ''
for i in ipstring:
    # Keep only alphanumeric characters and spaces
    if i.isalnum() or i == ' ':
        opstring += i
print(opstring)
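The same filter can probably be written more compactly with str.join and a generator expression, which avoids building the result one character at a time. A sketch, with the hypothetical name keep_alnum_and_spaces:

```python
def keep_alnum_and_spaces(text):
    # Keep a character only if it is alphanumeric or a space.
    return ''.join(ch for ch in text if ch.isalnum() or ch == ' ')

print(keep_alnum_and_spaces("text with symbols!@#$^&*( ends here"))
# text with symbols ends here
```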
Answer #5
You can also use a function that takes a list of regular expressions or other patterns to remove. This lets you mix regular expressions, character classes, and plain literal text patterns. It's useful when you need to strip many items, such as HTML elements.
Note: this is for Python 3.x.
import re  # Regular expression library

def string_cleanup(x, notwanted):
    # Remove every pattern in notwanted from the string x
    for item in notwanted:
        x = re.sub(item, '', x)
    return x

line = "<title>My example: <strong>A text %very% $clean!!</strong></title>"
print("Uncleaned:", line)

# Get rid of html elements
html_elements = ["<title>", "</title>", "<strong>", "</strong>"]
line = string_cleanup(line, html_elements)
print("1st clean:", line)

# Get rid of special characters
special_chars = ["[!@#$]", "%"]
line = string_cleanup(line, special_chars)
print("2nd clean:", line)
The function string_cleanup takes your string x and a list notwanted as parameters. For each item in that list of elements or patterns, it performs a substitution wherever a match is found.
Output:
Uncleaned: <title>My example: <strong>A text %very% $clean!!</strong></title>
1st clean: My example: A text %very% $clean!!
2nd clean: My example: A text very clean
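One thing to watch for: because each item is passed to re.sub as a pattern, literal text containing regex metacharacters (such as '$' or '(') would need escaping. Below is a hedged sketch of a variant that treats every item as plain text by using re.escape. The name literal_cleanup is hypothetical and not part of the answer above:

```python
import re

def literal_cleanup(x, notwanted):
    # Treat every item as literal text by escaping regex metacharacters.
    for item in notwanted:
        x = re.sub(re.escape(item), '', x)
    return x

print(literal_cleanup("Price: $5 (approx.)", ["$", "(", ")"]))
# Price: 5 approx.
```

The trade-off is that character classes like "[!@#$]" no longer work as patterns with this variant, so it only suits lists of fixed strings.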