How to remove illegal characters from path and filenames?

离开以前 2020-11-22 17:18

I need a robust and simple way to remove illegal path and file characters from a simple string. I've used the below code but it doesn't seem to do anything, what am I missing?

29 answers
  •  梦谈多话
    2020-11-22 17:38

    Scanning over the answers here, they all** seem to involve using a char array of invalid filename characters.

    Granted, this may be micro-optimising - but for the benefit of anyone who might be looking to check a large number of values for being valid filenames, it's worth noting that building a hashset of invalid chars will bring about notably better performance.

    I have been very surprised (shocked) in the past at how few items it takes before a hashset (or dictionary) outperforms iterating over a list. With strings, the crossover is ridiculously low (about 5-7 items, if I remember correctly). With most other simple data (object references, numbers etc.) the magic crossover seems to be around 20 items.

    There are about 40 invalid characters in the array returned by Path.GetInvalidFileNameChars() on Windows (the exact set is platform-dependent). I searched today and found quite a good benchmark here on StackOverflow showing that a hashset takes a little over half the time of an array/list for 40 items: https://stackoverflow.com/a/10762995/949129
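
    To make the array-versus-hashset comparison concrete, here is a minimal sketch of the two lookup strategies with a rough Stopwatch timing. The class name InvalidCharLookupSketch, the sample string and the iteration count are all made up for illustration and are not part of the helper class in this answer. Timings will vary by machine and runtime, so treat the output as indicative only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical illustration class; not part of the PathExtensions helper.
public static class InvalidCharLookupSketch
{
    // Linear scan: Array.IndexOf walks the array for every character tested.
    public static int CountInvalidWithArray(string text, char[] invalid)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (Array.IndexOf(invalid, c) >= 0) count++;
        }
        return count;
    }

    // Hashed lookup: HashSet<char>.Contains is O(1) on average.
    public static int CountInvalidWithSet(string text, HashSet<char> invalid)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (invalid.Contains(c)) count++;
        }
        return count;
    }

    public static void Main()
    {
        // The contents of this array depend on the platform.
        char[] invalidArray = Path.GetInvalidFileNameChars();
        var invalidSet = new HashSet<char>(invalidArray);

        string sample = "report: draft/final?.txt"; // arbitrary test input
        const int iterations = 1000000;             // arbitrary repeat count

        var sw = Stopwatch.StartNew();
        int arrayHits = 0;
        for (int i = 0; i < iterations; i++)
            arrayHits += CountInvalidWithArray(sample, invalidArray);
        sw.Stop();
        Console.WriteLine($"array:   {sw.ElapsedMilliseconds} ms ({arrayHits} hits)");

        sw.Restart();
        int setHits = 0;
        for (int i = 0; i < iterations; i++)
            setHits += CountInvalidWithSet(sample, invalidSet);
        sw.Stop();
        Console.WriteLine($"hashset: {sw.ElapsedMilliseconds} ms ({setHits} hits)");
    }
}
```

    Both methods return the same counts; only the lookup cost differs. For a serious measurement a dedicated benchmarking harness would be more trustworthy than a bare Stopwatch loop.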

    Here's the helper class I use for sanitising paths. I forget now why I had the fancy replacement option in it, but it's there as a cute bonus.

    Additional bonus method "IsValidLocalPath" too :)

    (** those which don't use regular expressions)

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;

    public static class PathExtensions
    {
        // Lazily built set of invalid filename characters (platform-dependent).
        private static HashSet<char> _invalidFilenameChars;
        private static HashSet<char> InvalidFilenameChars
        {
            get { return _invalidFilenameChars ?? (_invalidFilenameChars = new HashSet<char>(Path.GetInvalidFileNameChars())); }
        }
    
    
        /// <summary>
        /// Replaces characters in <paramref name="text"/> that are not allowed in file names with the 
        /// specified replacement character.
        /// </summary>
        /// <param name="text">Text to make into a valid filename. The same string is returned if 
        /// it is valid already.</param>
        /// <param name="replacement">Replacement character, or NULL to remove bad characters.</param>
        /// <param name="fancyReplacements">TRUE to replace quotes and slashes with the non-ASCII characters ” and ⁄.</param>
        /// <returns>A string that can be used as a filename. If the output string would otherwise be empty, "_" is returned.</returns>
        public static string ToValidFilename(this string text, char? replacement = '_', bool fancyReplacements = false)
        {
            StringBuilder sb = new StringBuilder(text.Length);
            HashSet<char> invalids = InvalidFilenameChars;
            bool changed = false;
    
            for (int i = 0; i < text.Length; i++)
            {
                char c = text[i];
                if (invalids.Contains(c))
                {
                    changed = true;
                    char repl = replacement ?? '\0';
                    if (fancyReplacements)
                    {
                        if (c == '"') repl = '”'; // U+201D right double quotation mark
                        else if (c == '\'') repl = '’'; // U+2019 right single quotation mark
                        else if (c == '/') repl = '⁄'; // U+2044 fraction slash
                    }
                    if (repl != '\0')
                        sb.Append(repl);
                }
                else
                    sb.Append(c);
            }
    
            if (sb.Length == 0)
                return "_";
    
            return changed ? sb.ToString() : text;
        }
    
    
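        /// <summary>
        /// Returns TRUE if the specified path is a valid, local filesystem path.
        /// </summary>
        /// <param name="pathString">The path to check.</param>
        /// <returns>TRUE if the path parses as an absolute URI that refers to the local machine.</returns>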
        public static bool IsValidLocalPath(this string pathString)
        {
            // From solution at https://stackoverflow.com/a/11636052/949129
            Uri pathUri;
            Boolean isValidUri = Uri.TryCreate(pathString, UriKind.Absolute, out pathUri);
            return isValidUri && pathUri != null && pathUri.IsLoopback;
        }
    }
    
