Python string.replace() not replacing characters

Posted by 佐手、 on 2019-11-29 17:33:36

Question


Some background information: where I work we have an ancient web-based document database system, consisting almost entirely of MS Office documents with the "normal" extensions (.doc, .xls, .ppt). They are all named after some sort of arbitrary ID number (e.g. 1245.doc). We're switching to SharePoint and I need to rename all of these files and sort them into folders. I have a CSV file with all sorts of information (like which ID number corresponds to which document's title), so I'm using it to rename these files. I've written a short Python script that renames each file from its ID number to its title.

However, some of the document titles contain slashes and other characters that are likely to cause problems in a file name, so I want to replace them with underscores:

bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
for letter in bad_characters:
    filename = line[2].replace(letter, "_")
    foldername = line[5].replace(letter, "_")
  • Example of line[2]: "Blah blah boring - meeting 2/19/2008.doc"
  • Example of line[5]: "Business meetings 2/2008"

When I add print letter inside of the for loop, it will print out the letter it's supposed to be replacing, but won't actually replace that character with an underscore like I want it to.

Is there anything I'm doing wrong here?


Answer 1:


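That's because filename and foldername get thrown away with each iteration of the loop. The .replace() method returns a new string, but every iteration calls it on the original line[2] and line[5] again, so the result of the previous replacement is never carried forward.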

You should use:

filename = line[2]
foldername = line[5]

for letter in bad_characters:
    filename = filename.replace(letter, "_")
    foldername = foldername.replace(letter, "_")

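But I would do it with a regex. It's cleaner and (likely) faster:

import re

# One character class covers every bad character; the raw string keeps the backslash readable.
p = re.compile(r'[/\\:()<>|?*]')
filename = p.sub('_', line[2])
foldername = p.sub('_', line[5])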
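As a minimal sketch of the regex approach, the pattern can also be built from the question's bad_characters list so the two never drift apart. The names BAD_CHARACTERS, BAD_PATTERN and sanitize are hypothetical and not part of the original answer.

```python
import re

# Characters taken from the question's bad_characters list.
BAD_CHARACTERS = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]

# re.escape() makes each character safe to place inside a character class.
BAD_PATTERN = re.compile("[" + "".join(re.escape(c) for c in BAD_CHARACTERS) + "]")


def sanitize(name):
    """Return name with every bad character replaced by an underscore (hypothetical helper)."""
    return BAD_PATTERN.sub("_", name)


print(sanitize("Blah blah boring - meeting 2/19/2008.doc"))
print(sanitize("Business meetings 2/2008"))
```

Compiling the pattern once and reusing it avoids rebuilding it for every row of the CSV file.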



Answer 2:


You are reassigning the filename and foldername variables from the original line[2] and line[5] at every iteration of the loop. In effect, only the last character in the list, *, is being replaced.
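The difference can be seen in a small sketch that runs both loops on the example title from the question. The variable names title, buggy and fixed are chosen here for illustration only.

```python
bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
title = "Blah blah boring - meeting 2/19/2008.doc"

# Original loop: every iteration restarts from the untouched title,
# so only the replacement for the last character ("*") survives.
for letter in bad_characters:
    buggy = title.replace(letter, "_")

# Corrected loop: each iteration works on the result of the previous one.
fixed = title
for letter in bad_characters:
    fixed = fixed.replace(letter, "_")

print(buggy)
print(fixed)
```

Because the example title contains no *, the original loop leaves it completely unchanged, which matches the behaviour described in the question.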




Answer 3:


You should look at the Python string method translate() (http://docs.python.org/library/string.html#string.translate), used together with string.maketrans() (http://docs.python.org/library/string.html#string.maketrans). Note that string.maketrans() exists only in Python 2.

Editing this to add an example as per comment suggestion below:
import string

toreplace = ''.join(["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"])
underscore = ''.join(['_'] * len(toreplace))
transtable = string.maketrans(toreplace, underscore)
filename = filename.translate(transtable)
foldername = foldername.translate(transtable)

This can be simplified by writing toreplace as a single string literal such as '/\\:' and so on; I just reused the list given above.
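For readers on Python 3, where the module-level string.maketrans() no longer exists, a rough equivalent uses the built-in str.maketrans(). This is a sketch rather than part of the original answer, and the names BAD_CHARACTERS and TRANSTABLE are hypothetical.

```python
# Python 3 sketch: str.maketrans() takes two equal-length strings and
# maps each character of the first to the character at the same
# position in the second.
BAD_CHARACTERS = '/\\:()<>|?*'
TRANSTABLE = str.maketrans(BAD_CHARACTERS, "_" * len(BAD_CHARACTERS))

filename = "Blah blah boring - meeting 2/19/2008.doc".translate(TRANSTABLE)
foldername = "Business meetings 2/2008".translate(TRANSTABLE)

print(filename)
print(foldername)
```

The table is built once and translate() then handles all ten characters in a single pass over each string.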




Answer 4:


You are starting over from the original line[2] and line[5] on each iteration instead of saving the replaced result, so you end up with the equivalent of:

filename = line[2].replace('*', '_')
foldername = line[5].replace('*', '_')

Try the following:

bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
filename = line[2]
foldername = line[5]
for letter in bad_characters:
    filename = filename.replace(letter, "_")
    foldername = foldername.replace(letter, "_")
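To show how the corrected loop might fit into the renaming script, here is a minimal sketch that applies it to rows read with the csv module. Only columns 2 and 5 are described in the question; the rest of the sample row, the in-memory sample_csv, and the sanitize helper are assumptions made for illustration.

```python
import csv
import io

BAD_CHARACTERS = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]


def sanitize(name):
    """Replace each bad character in turn, keeping the result between steps."""
    for letter in BAD_CHARACTERS:
        name = name.replace(letter, "_")
    return name


# Hypothetical sample data standing in for the real CSV file.
sample_csv = io.StringIO(
    '1245,x,"Blah blah boring - meeting 2/19/2008.doc",x,x,"Business meetings 2/2008"\n'
)

targets = []
for line in csv.reader(sample_csv):
    filename = sanitize(line[2])
    foldername = sanitize(line[5])
    targets.append((foldername, filename))

print(targets)
```

Moving the loop into a function means the accumulation bug cannot reappear at each call site, since the function always starts from its argument and returns the fully cleaned string.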



Answer 5:


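You can also use the module-level function string.replace(str, fromStr, toStr), which exists only in Python 2 and behaves like str.replace(fromStr, toStr). The result still has to be carried from one iteration to the next:

import string  # Python 2 only: string.replace() was removed in Python 3

bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
filename = line[2]
foldername = line[5]
for letter in bad_characters:
    filename = string.replace(filename, letter, "_")
    foldername = string.replace(foldername, letter, "_")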


Source: https://stackoverflow.com/questions/3523054/python-string-replace-not-replacing-characters
