How to delete folder with unicode character filenames using Perl rmtree?

Posted by 放肆的年华 on 2021-02-10 07:53:07

Question


I have some Perl code that deletes folders using File::Path::rmtree. It works when the folder tree contains only files and folders with ASCII names, but fails when it contains files or folders with Unicode names. The Perl version I am using is "This is perl 5, version 12, subversion 4 (v5.12.4) built for MSWin32-x86-multi-thread".

I have also tried the latest Perl version, but the issue persists. Here is sample code:

use strict 'vars';
require File::Path;

# Remove the test directory and report if it is still there afterwards.
sub Rmdir
{
    my ($Arena) = "D:\\tmp\\TestUnicodeRm";

    if (-d $Arena) {
        print "Dir to Rmtree $Arena\n";
        File::Path::rmtree($Arena, 0, 0);
    }

    if (-d $Arena) {
        print "Failed to clean up test area $Arena.\n";
    }
}

Rmdir();

1;

If the directory 'D:\tmp\TestUnicodeRm' contains a file named, say, 'chinese_trad_我的文件.txt', I get the error "cannot remove directory for XXX: Directory not empty at D:\tmp\rmtree.pm line XX".

Thanks in advance!


Answer 1:


You can use the subs provided by Win32::Unicode::File and Win32::Unicode::Dir to do what you want.


Windows provides two versions of each API call that accepts or returns text.

  • The versions with the "A" (ANSI) suffix expect and return text encoded using the system's Active Code Page. ("cp".Win32::GetACP() provides an encoding name you can use with the subs provided by Encode.)

    For example, the DeleteFileA system call is used to delete a file, and it expects a path encoded using the system's Active Code Page.

  • The versions with the "W" (Wide) suffix expect and return text encoded using UTF-16le.

    For example, the DeleteFileW system call is used to delete a file, and it expects a path encoded using UTF-16le.

Perl uses the "A" version of all system calls. The "W" version is required here.
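To illustrate the difference between the two encodings, here is a small sketch using Encode on the file name from the question. The code page cp1252 is only an assumption for the sake of the example; on a real system the name would come from "cp".Win32::GetACP() as noted above. With Encode's default settings, characters that the code page cannot represent are replaced by a substitution character, so the original name cannot be recovered from the "A" form.

```perl
use strict;
use warnings;
use utf8;                    # the string literal below contains non-ASCII characters
use Encode qw(encode);

my $name = 'chinese_trad_我的文件.txt';

# What an "A" call would be given: bytes in the Active Code Page.
# cp1252 is an assumed code page, chosen only for illustration.
my $ansi = encode('cp1252', $name);

# What a "W" call would be given: UTF-16le, two bytes per character here.
my $wide = encode('UTF-16LE', $name);

print "ANSI form: $ansi\n";
printf "Wide form: %d bytes for %d characters\n", length($wide), length($name);
```

On a system whose Active Code Page does cover the characters in the name, the "A" form would survive the round trip, which is why the problem shows up only for some file names.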

The modules mentioned above provide access to the "W" versions of the calls you need.
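A minimal sketch of that approach, assuming Win32::Unicode is installed from CPAN and that rmtreeW in Win32::Unicode::Dir works as a wide-character counterpart to File::Path::rmtree, as its documentation describes. The helper name remove_tree_unicode is hypothetical, and the path is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a directory tree through the "W" API.
# $dir is expected to be a decoded (character) string, not bytes.
sub remove_tree_unicode {
    my ($dir) = @_;
    die "Win32::Unicode only works on Windows\n" unless $^O eq 'MSWin32';

    # Loaded at run time so this file still compiles on other platforms.
    require Win32::Unicode::Dir;
    Win32::Unicode::Dir::rmtreeW($dir);

    # -d goes through the "A" API, which is fine here only because the
    # top-level path is plain ASCII.
    return !-d $dir;
}

if ($^O eq 'MSWin32') {
    my $arena = 'D:\\tmp\\TestUnicodeRm';
    remove_tree_unicode($arena)
        or print "Failed to clean up test area $arena.\n";
}
```

If the top-level path itself contained non-ASCII characters, the final -d test would need a wide-character equivalent as well; Win32::Unicode::File provides file tests for that purpose.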




Answer 2:


Filenames are always bytes. Unfortunately, nothing indicates or requires that Unicode characters in filenames be represented in a particular encoding, and every OS has different conventions. On most Unix-like systems, filenames are encoded as UTF-8 and handled as bytes. On Windows, however, filenames are stored as UTF-16 but handled as decoded characters. It sounds like a bug in File::Path that it doesn't properly deal with these filenames as it finds them. Since you are not providing the filenames yourself, it can't be a bug in your code.

I would first suggest making sure your File::Path is the latest version (2.16 at the time of writing). If that doesn't work, all I can suggest is to report a bug, and either remove the files and subdirectories yourself by recursing with opendir and readdir, or shell out to rd /s.

my $rc = system 'rd', '/s', '/q', $dir; # /q suppresses the confirmation prompt; check for errors as in system() docs
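The error check mentioned in that comment could look roughly like the sketch below, which follows the status-decoding idiom from the system() documentation. The helper name run_checked is hypothetical, and the rd invocation is left commented out because it only makes sense on Windows.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and describe how it ended.
sub run_checked {
    my @cmd = @_;
    my $rc  = system @cmd;

    return "failed to execute: $!" if $rc == -1;
    return sprintf('died with signal %d', $rc & 127) if $rc & 127;

    my $exit = $rc >> 8;    # the command's own exit status
    return $exit == 0 ? 'ok' : "exited with status $exit";
}

# On Windows, with the directory from the question:
# print run_checked('rd', '/s', '/q', 'D:\\tmp\\TestUnicodeRm'), "\n";
```

Re-testing the directory with -d afterwards, as the code in the question does, is a reasonable extra check in case the exit status alone does not tell the whole story.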


Source: https://stackoverflow.com/questions/55616658/how-to-delete-folder-with-unicode-character-filenames-using-perl-rmtree
