Safe/Allowed filename cleaner for .NET

Submitted by 流过昼夜 on 2019-11-29 07:13:35

This problem is not as simple as you may think. Not only are the characters in Path.GetInvalidFileNameChars illegal, there are several filenames, such as "PRN" and "CON", that are reserved by Windows and cannot be created. Any name that ends in "." is also illegal in Windows. Moreover, there are various length limitations. Read the full list here.
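As a rough illustration of the rules above, the following sketch flags names that contain invalid characters, end in ".", or collide with a reserved device name. The class and method names (WindowsNameRules, IsProblematic) are hypothetical, the reserved list is the commonly documented one (CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9), and length limits are deliberately not checked.

```csharp
using System;
using System.IO;
using System.Linq;

public static class WindowsNameRules
{
    // Commonly documented reserved device names (an assumption of this sketch,
    // not an exhaustive or version-specific list).
    private static readonly string[] Reserved =
        new[] { "CON", "PRN", "AUX", "NUL" }
            .Concat(Enumerable.Range(1, 9).Select(i => "COM" + i))
            .Concat(Enumerable.Range(1, 9).Select(i => "LPT" + i))
            .ToArray();

    // Hypothetical helper: true if the name breaks one of the rules described above.
    public static bool IsProblematic(string name)
    {
        if (string.IsNullOrEmpty(name)) return true;

        // Names ending in "." are rejected.
        if (name.EndsWith(".", StringComparison.Ordinal)) return true;

        // Any character reported by the framework as invalid is rejected.
        if (name.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0) return true;

        // Reserved names are compared case-insensitively, and the part before
        // the first dot is used so that "CON.txt" is flagged as well.
        int dot = name.IndexOf('.');
        string stem = dot < 0 ? name : name.Substring(0, dot);
        return Reserved.Contains(stem, StringComparer.OrdinalIgnoreCase);
    }
}
```

For example, WindowsNameRules.IsProblematic("con.txt") and WindowsNameRules.IsProblematic("report.") both return true, while "report.txt" passes.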

If that's not enough, different filesystems have different limitations; for example, ISO 9660 filenames cannot start with "-" but can contain it.

You can use Path.GetInvalidFileNameChars to find out which characters of the string are invalid, and either convert them to a valid character such as a hyphen or, if you need bidirectional conversion, replace each one with an escape token such as % followed by the hexadecimal representation of its Unicode code (I have actually used this technique once but don't have the code at hand right now).

EDIT: Just in case someone is interested, here is the code.

using System;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

/// <summary>
/// Escapes an object name so that it is a valid filename.
/// </summary>
/// <param name="fileName">Original object name.</param>
/// <returns>Escaped name.</returns>
/// <remarks>
/// All characters that are not valid for a filename, plus "%" and ".", are converted into "%uuuu", where uuuu is the hexadecimal
/// unicode representation of the character.
/// </remarks>
private string EscapeFilename(string fileName)
{
    char[] invalidChars=Path.GetInvalidFileNameChars();

    // Replace "%", then replace all other characters, then replace "."

    fileName=fileName.Replace("%", "%0025");
    foreach(char invalidChar in invalidChars)
    {
        // "X4" formats the character code as four zero-padded hex digits.
        fileName=fileName.Replace(invalidChar.ToString(), "%"+((int)invalidChar).ToString("X4"));
    }
    return fileName.Replace(".", "%002E");
}

/// <summary>
/// Unescapes an escaped file name so that the original object name is obtained.
/// </summary>
/// <param name="escapedName">Escaped object name (see the EscapeFilename method).</param>
/// <returns>Unescaped (original) object name.</returns>
public string UnescapeFilename(string escapedName)
{
    //We need to temporarily replace %0025 with %! to prevent a name that
    //originally contained escape-like sequences from being unescaped incorrectly.
    //For example, ".%002E" once escaped is "%002E%0025002E".
    //Without this temporary replacement, it would be unescaped to "..".

    string unescapedName=escapedName.Replace("%0025", "%!");
    Regex regex=new Regex("%(?<esc>[0-9A-Fa-f]{4})");
    Match m=regex.Match(escapedName);
    while(m.Success)
    {
        foreach(Capture cap in m.Groups["esc"].Captures)
            unescapedName=unescapedName.Replace("%"+cap.Value, Convert.ToChar(int.Parse(cap.Value, NumberStyles.HexNumber)).ToString());
        m=m.NextMatch();
    }
    return unescapedName.Replace("%!", "%");
}
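For comparison, here is a minimal single-pass sketch of the same %uuuu scheme. It is an alternative formulation, not the code above: the class name FileNameEscaper and its method names are hypothetical, and it relies on Regex.Replace scanning left to right without rescanning its own output, which makes the temporary "%!" placeholder unnecessary.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class FileNameEscaper
{
    // Escapes invalid filename characters, plus "%" and ".", as %uuuu.
    public static string Escape(string name)
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            if (c == '%' || c == '.' || Array.IndexOf(invalid, c) >= 0)
                sb.Append('%').Append(((int)c).ToString("X4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Decodes every %uuuu sequence in one pass. A decoded "%" is written to
    // the output and never re-examined, so it cannot start a new escape.
    public static string Unescape(string escaped)
    {
        return Regex.Replace(escaped, "%([0-9A-Fa-f]{4})",
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}
```

With this sketch, FileNameEscaper.Escape(".%002E") yields "%002E%0025002E", the same escaped form as in the comment above, and FileNameEscaper.Unescape turns it back into ".%002E".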

Can you provide more detail on what you mean by "generate from an arbitrary string"? Based on what you're saying, it sounds like you're asking:

Is there any way to take an arbitrary string and mangle it in such a way that it represents a valid file name?

If that's the case, then no, there is no standard function available that I am aware of. However, you could use the following, which should do the trick (it needs using System.IO, System.Linq and System.Text):

public static string MakeValidFileName(string name) {
  var invalid = Path.GetInvalidFileNameChars();
  var builder = new StringBuilder();
  foreach ( var cur in name ) {
    builder.Append(invalid.Contains(cur) ? '_' : cur);
  }
  return builder.ToString();
}

Have you had a look at Path.GetInvalidFileNameChars?

Found at Really Useful .NET Classes Part 1 - System.IO.Path

Just for the fun of it, I did it in one line. Regex.Escape is needed so that the backslash in the invalid-character list is not read as an escape inside the character class.

Regex.Replace("http://codereview.stackexchange.com/questions/33851/how-can-i-improve-my-code/33857#33857", "[" + Regex.Escape(new string(Path.GetInvalidFileNameChars())) + "]", "_")