Question
I have some Perl code that deletes folders using File::Path::rmtree. The function works when the folder tree contains only files and folders with ASCII names, but fails when it contains files or folders with Unicode names. The Perl version I am using is "This is perl 5, version 12, subversion 4 (v5.12.4) built for MSWin32-x86-multi-thread".
I have also tried the latest Perl version, but the issue persists. Here is sample code:
use strict 'vars';
require File::Path;

sub Rmdir($)
{
    my ($Arena) = "D:\\tmp\\TestUnicodeRm";
    if (-d $Arena) {
        print "Dir to Rmtree $Arena\n";
        File::Path::rmtree($Arena, 0, 0);
    }
    if (-d $Arena) {
        print "Failed to clean up test area $Arena.\n";
    }
}

Rmdir $0;
1;
If the directory 'D:\tmp\TestUnicodeRm' contains a file named, say, 'chinese_trad_我的文件.txt', I get the error "cannot remove directory for XXX: Directory not empty at D:\tmp\rmtree.pm line XX".
Thanks in advance!
Answer 1:
You can use the subs provided by Win32::Unicode::File and Win32::Unicode::Dir to do what you want.
Windows provides two versions of each API call that accepts or returns text.
The versions with the "A" (ANSI) suffix expect and return text encoded using the system's Active Code Page. ("cp".Win32::GetACP() provides an encoding name you can use with the subs provided by Encode.) For example, the DeleteFileA system call is used to delete a file, and it expects a path encoded using the system's Active Code Page.
The versions with the "W" (Wide) suffix expect and return text encoded using UTF-16le. For example, the DeleteFileW system call is used to delete a file, and it expects a path encoded using UTF-16le.
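To make the difference between the two encodings concrete, here is a small sketch that encodes the file name from the question both ways using Encode. The choice of cp1252 as the Active Code Page is an assumption for illustration only; on a real system the name would come from "cp".Win32::GetACP().

```perl
use strict;
use warnings;
use utf8;                  # this source file contains non-ASCII literals
use Encode qw(encode);

my $name = "chinese_trad_我的文件.txt";

# Roughly what a "W" call works with: UTF-16LE bytes, two per character here.
my $wide = encode('UTF-16LE', $name);

# Roughly what an "A" call works with, ASSUMING the Active Code Page is cp1252.
# With Encode's default behaviour, unmappable characters are substituted.
my $ansi = encode('cp1252', $name);

printf "characters:     %d\n", length $name;
printf "UTF-16LE bytes: %d\n", length $wide;
print  "cp1252 bytes:   $ansi\n";
```

Under that assumption the four Chinese characters have no representation in the code page, so the "A" form of the path no longer identifies the real file.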
Perl uses the "A" version of all system calls. The "W" version is required here.
The modules mentioned above provide access to the "W" version of calls you need.
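As a rough sketch of how that could look for the directory in the question: Win32::Unicode::Dir documents an rmtreeW sub as the counterpart of File::Path::rmtree. The helper name remove_tree_wide is made up for this example, and the sketch assumes Win32::Unicode is installed from CPAN and that rmtreeW is exported by default; I have not tested it against the asker's setup.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a directory tree through the "W" API calls.
sub remove_tree_wide {
    my ($dir) = @_;

    # Loaded at run time because Win32::Unicode only exists on Windows.
    require Win32::Unicode::Dir;
    Win32::Unicode::Dir->import;    # assumption: rmtreeW is a default export

    return rmtreeW($dir);
}

if ($^O eq 'MSWin32') {
    my $arena = "D:\\tmp\\TestUnicodeRm";
    if (-d $arena) {
        print "Dir to Rmtree $arena\n";
        remove_tree_wide($arena);
    }
    if (-d $arena) {
        print "Failed to clean up test area $arena.\n";
    }
}
```

If the path itself contained non-ASCII characters, it would need to be a decoded Perl character string rather than bytes.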
Answer 2:
Filenames are always bytes. Unfortunately, nothing indicates or requires that Unicode characters in filenames be represented in a particular encoding, and every OS has different conventions. On most Unix-like systems, filenames are encoded as UTF-8 and handled as bytes. On Windows, however, filenames are stored as UTF-16 but handled as decoded characters. It sounds like a bug in File::Path that it doesn't deal properly with these filenames as it finds them; since you are not the one providing the filenames, it can't be a bug in your code.
I would first suggest making sure your File::Path is the latest version (2.16). If this doesn't work, all I can suggest is to report a bug, and either manually recursively use opendir and readdir to remove files and subdirectories, or shell out to rd /s.
my $rc = system 'rd', '/s', '/q', $dir; # /q suppresses the confirmation prompt; check for errors as in system() docs
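The manual opendir/readdir fallback mentioned above could look something like the following. This is only a sketch: remove_tree_manually is a made-up name, and it assumes that the names returned by readdir can be passed back to unlink and rmdir unchanged. On Windows that assumption may not hold for names outside the Active Code Page, in which case this would fail in the same way as rmtree.

```perl
use strict;
use warnings;

# Hypothetical helper: delete $dir and everything below it.
sub remove_tree_manually {
    my ($dir) = @_;

    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    for my $entry (@entries) {
        my $path = "$dir/$entry";
        if (-d $path && !-l $path) {
            # Real subdirectory (not a symlink): empty it first.
            remove_tree_manually($path);
        }
        else {
            unlink $path or die "Cannot remove $path: $!";
        }
    }

    # The directory is empty now, so rmdir can succeed.
    rmdir $dir or die "Cannot remove directory $dir: $!";
    return 1;
}
```

It would be called as remove_tree_manually($dir), and it dies on the first entry it cannot remove, which at least shows which name is causing trouble.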
Source: https://stackoverflow.com/questions/55616658/how-to-delete-folder-with-unicode-character-filenames-using-perl-rmtree