I have been given the task of removing all non-numeric characters, including spaces, from either a text file or a string, and then printing the new result next to the original characters. For example:
Before:
sd67637 8
After:
sd67637 8 = 676378
As I am a beginner, I do not know where to start with this task. Please help.
The easiest way is with a regular expression:
import re
a = 'lkdfhisoe78347834 (())&/&745 '
result = re.sub(r'[^0-9]', '', a)  # replace every non-digit with nothing
print(result)
# 78347834745
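To get the "old = new" output the question asks for, the substitution can be wrapped in a small helper. This is only a sketch: the names NON_DIGIT and strip_non_digits are made up for illustration and are not part of the re module.

```python
import re

# Precompiled pattern matching any character that is not an ASCII digit.
# NON_DIGIT and strip_non_digits are hypothetical names for this sketch.
NON_DIGIT = re.compile(r'[^0-9]')


def strip_non_digits(text):
    """Return text with every non-digit character removed."""
    return NON_DIGIT.sub('', text)


before = 'sd67637 8'
after = strip_non_digits(before)
print('{} = {}'.format(before, after))  # sd67637 8 = 676378
```

Compiling the pattern once is mostly a convenience here; for a single short string, calling re.sub directly works just as well.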
Loop over your string character by character and keep only the digits:
new_string = ''.join(ch for ch in your_string if ch.isdigit())
Or use a regex on your string, which is handy if at some point you want to treat non-contiguous groups of digits separately:
import re
s = 'sd67637 8'
new_string = ''.join(re.findall(r'\d+', s))
# 676378
Then just print them out, where old_string is your original string:
print(old_string, '=', new_string)
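Since the question also mentions a text file, here is a rough sketch of the same idea applied line by line. The function name describe_lines and the file name input.txt are assumptions for illustration, and io.StringIO stands in for a real file so the example runs on its own.

```python
import io


def describe_lines(lines):
    """Yield 'old = new' for each line, where new keeps only the digits."""
    for line in lines:
        old = line.rstrip('\n')  # drop the trailing newline before printing
        new = ''.join(ch for ch in old if ch.isdigit())
        yield '{} = {}'.format(old, new)


# io.StringIO stands in for a real file here. With a real file you would
# pass the object returned by open('input.txt') instead (hypothetical name).
fake_file = io.StringIO('sd67637 8\nabc 12-34\n')
results = list(describe_lines(fake_file))
for row in results:
    print(row)
```

Any iterable of strings works as input, so the same function can be fed a list of strings instead of a file.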
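There is a builtin for this in Python 2 (the string.translate function and the two-argument form of str.translate used here do not exist in Python 3):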
string.translate(s, table[, deletechars])
Delete all characters from s that are in deletechars (if present), and then translate the characters using table, which must be a 256-character string giving the translation for each character value, indexed by its ordinal. If table is None, then only the character deletion step is performed.
>>> import string
>>> non_numeric_chars = ''.join(set(string.printable) - set(string.digits))
>>> non_numeric_chars = string.printable[10:]  # more efficient alternative (choose one of the two)
>>> 'sd67637 8'.translate(None, non_numeric_chars)
'676378'
Or you could do it with no imports (but there is no reason for this):
>>> chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
>>> 'sd67637 8'.translate(None, chars)
'676378'
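The translate calls above use the Python 2 signature. In Python 3, str.translate takes a single table argument, so a rough equivalent builds a deletion table with str.maketrans first. The variable names non_digits, delete_table and cleaned are just illustrative.

```python
import string

# Every printable ASCII character that is not a digit.
non_digits = ''.join(c for c in string.printable if c not in string.digits)

# In Python 3, the third argument of str.maketrans lists characters to delete.
delete_table = str.maketrans('', '', non_digits)

cleaned = 'sd67637 8'.translate(delete_table)
print(cleaned)  # 676378
```

Note that this only deletes printable ASCII characters; anything outside that set (accented letters, for example) would pass through unchanged.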
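You can use string.ascii_letters to remove the letters (note that this leaves any punctuation in place):
from string import ascii_letters
a = 'sd67637 8'
a = a.replace(' ', '')  # remove spaces first
for i in ascii_letters:
    a = a.replace(i, '')  # remove each letter in turn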
In case you want to remove a single quote character, wrap it in double quotes ("'") instead of single quotes.
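The loop above strips only ASCII letters and the space character, so input containing punctuation or tabs would keep those characters. A possible extension, sketched here with the made-up names UNWANTED and remove_chars, adds string.punctuation and string.whitespace to the set of characters to remove.

```python
import string

# Characters to strip: ASCII letters, punctuation, and whitespace.
# UNWANTED and remove_chars are hypothetical names for this sketch.
UNWANTED = string.ascii_letters + string.punctuation + string.whitespace


def remove_chars(text, unwanted):
    """Remove every character in unwanted from text, one replace at a time."""
    for ch in unwanted:
        text = text.replace(ch, '')
    return text


a = remove_chars('sd67637 8 (x)', UNWANTED)
print(a)  # 676378
```

This makes one pass over the string per unwanted character, which is fine for short inputs but slower than the single-pass approaches in the other answers.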
Source: https://stackoverflow.com/questions/17336943/removing-non-numeric-characters-from-a-string