How can I sanitize a string for use as a filename?

说谎 2020-12-23 22:56

I've got a routine that converts a file into a different format and saves it. The original datafiles were numbered, but my routine gives the output a filename based on an

9 Answers
  •  鱼传尺愫
    2020-12-23 23:34

    Well, the easy thing is to use a regex and your favourite language's version of gsub to replace anything that's not a "word character." This character class is "\w" in most languages with Perl-like regexes (it matches letters, digits, and the underscore), or you can use "[A-Za-z0-9]" as a simple option otherwise.
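As a minimal sketch of that idea, here it is in Python, where `re.sub` plays the role of gsub. The function name `keep_word_chars` is made up for illustration, and the `re.ASCII` flag is an assumption about what you want: without it, Python's `\w` also matches non-ASCII letters and digits.

```python
import re

def keep_word_chars(name: str) -> str:
    """Delete every character that is not an ASCII letter, digit, or underscore."""
    # \W is the complement of \w; re.ASCII restricts \w to [A-Za-z0-9_].
    return re.sub(r"\W", "", name, flags=re.ASCII)

print(keep_word_chars("my report: v2/final?.txt"))  # myreportv2finaltxt
```

Note that the dot is removed too, so if the extension matters you would probably split it off first and sanitize only the stem.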

    In particular, in contrast to some of the examples in other answers, you don't want to look for invalid characters to remove, but for valid characters to keep. If you look for invalid characters, you're always vulnerable to the introduction of characters you didn't anticipate. If you look only for valid ones, you might be slightly less efficient (in that you replaced a character you didn't really need to), but at least you'll never be wrong.
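To make the difference concrete, here is a small Python comparison. Both patterns and both function names are hypothetical: the blacklist is just one plausible list of "known bad" characters, not a complete or authoritative one.

```python
import re

# Hypothetical blacklist: characters someone remembered to forbid.
BLACKLIST = re.compile(r'[\\/:*?"<>|]')
# Whitelist approach: match everything that is NOT explicitly allowed.
NOT_ALLOWED = re.compile(r"[^A-Za-z0-9]")

def strip_blacklisted(name: str) -> str:
    """Remove only the characters named in the blacklist."""
    return BLACKLIST.sub("", name)

def keep_whitelisted(name: str) -> str:
    """Remove everything except ASCII letters and digits."""
    return NOT_ALLOWED.sub("", name)

# A NUL byte and a newline were never added to the blacklist.
name = "data\x00set\n7"
print(repr(strip_blacklisted(name)))  # 'data\x00set\n7'
print(repr(keep_whitelisted(name)))   # 'dataset7'
```

The blacklist lets the control characters through because nobody thought to list them; the whitelist never had to know about them.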

    Now, if you want to make the new version as much like the old as possible, you might consider replacement. Instead of deleting, you can substitute a character or characters you know to be OK. But doing that is an interesting enough problem that it's probably a good topic for another question.
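A rough Python sketch of the replacement variant, assuming an underscore is an acceptable filler on your target filesystem. The name `replace_unsafe` and the `filler` parameter are illustrative only.

```python
import re

def replace_unsafe(name: str, filler: str = "_") -> str:
    """Replace each run of non-alphanumeric characters with a single filler."""
    # The + collapses a run such as ": " into one filler instead of two.
    return re.sub(r"[^A-Za-z0-9]+", filler, name)

print(replace_unsafe("my report: v2/final"))  # my_report_v2_final
```

One reason this gets interesting: different inputs can collapse to the same output ("a/b" and "a:b" both become "a_b"), so if names must stay unique you would need some extra scheme on top of this.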
