Reliable Perl encoding with File::Slurp

Submitted by 你说的曾经没有我的故事 on 2019-12-12 01:54:34

Question


I need to replace every occurrence of http:// with // in a file. The file may be (at least) in UTF-8, CP1251, or CP1255.

Does the following work?

use File::Slurp;
my $Text = read_file($File, binmode=>':raw');
$Text =~ s{http://}{//}gi;
write_file($File, {atomic=>1, binmode=>':raw'}, $Text);

It seems correct, but I need to be sure that the file will not be damaged whatever encoding it has. Please help me to be sure.


Answer 1:


This answer won't make you sure, though I hope it can help.

I don't see any problem with your script (I tested it with UTF-8 and ISO-8859-1 files without problems), though there is some discussion about whether File::Slurp handles encodings correctly: http://blogs.perl.org/users/leon_timmermans/2015/08/fileslurp-is-broken-and-wrong.html
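One way to gain more confidence is to check that a byte-level substitution gives the same result as a character-level one for each encoding you care about. The sketch below does that with the core Encode module. The sample strings and the %samples and %ok names are made up for illustration. It relies on the assumption that every encoding involved is ASCII-compatible, which holds for UTF-8, CP1251 and CP1255 but not for something like UTF-16.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Illustrative sample text per encoding (Cyrillic for UTF-8 and cp1251,
# Hebrew for cp1255), written with escapes so the source stays ASCII.
my %samples = (
    'UTF-8'  => "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442} http://example.com",
    'cp1251' => "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442} HTTP://example.com",
    'cp1255' => "\x{5E9}\x{5DC}\x{5D5}\x{5DD} http://example.com",
);

my %ok;
for my $enc (sort keys %samples) {
    my $chars = $samples{$enc};

    # These are the bytes that a :raw read of such a file would return.
    my $bytes = encode($enc, $chars);

    # Substitution on the raw bytes, as in the question.
    (my $fixed_bytes = $bytes) =~ s{http://}{//}gi;

    # Substitution on the decoded characters, as the reference result.
    (my $expected = $chars) =~ s{http://}{//}gi;

    # Decoding the edited bytes should give back the reference result.
    $ok{$enc} = decode($enc, $fixed_bytes) eq $expected ? 1 : 0;
    print "$enc: ", ($ok{$enc} ? "same result" : "DIFFERENT"), "\n";
}
```

This only shows that the approach holds for these samples. It does not prove anything about files in encodings that are not ASCII-compatible.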

In this answer on a similar subject, the author recommends File::Slurper as an alternative, due to better encoding handling: https://stackoverflow.com/a/206682/6193608




Answer 2:


It's no longer recommended to use File::Slurp (see here).

I would recommend using Path::Tiny. It's easy to use, works with both files and directories, only uses core modules, and has slurp/spew methods specifically for UTF-8 and raw data, so you shouldn't have a problem with the encoding.

Usage:

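use Path::Tiny;

# Read the file as raw bytes, with no decoding layer.
my $Text = path($File)->slurp_raw;

# Match the ASCII pattern directly against the bytes.
$Text =~ s{http://}{//}gi;

# Write the bytes back unchanged.
path($File)->spew_raw($Text);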

Update: From documentation on spew:

Writes data to a file atomically. The file is written to a temporary file in the same directory, then renamed over the original. An optional hash reference may be used to pass options. The only option is binmode, which is passed to binmode() on the handle used for writing.

spew_raw is like spew with a binmode of :unix for a fast, unbuffered, raw write.

spew_utf8 is like spew with a binmode of :unix:encoding(UTF-8) (or PerlIO::utf8_strict). If Unicode::UTF8 0.58+ is installed, a raw spew will be done instead on the data encoded with Unicode::UTF8.
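To illustrate the "temporary file in the same directory, then rename" behaviour described in that documentation, here is a minimal sketch using only core modules. It is not Path::Tiny's actual implementation. rewrite_raw_atomically is a hypothetical helper name, and the demo file contents are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: read a file as raw bytes, transform them, and
# replace the file via a temporary file plus rename.
sub rewrite_raw_atomically {
    my ($file, $transform) = @_;

    open my $in, '<:raw', $file or die "Cannot read $file: $!";
    my $bytes = do { local $/; <$in> };
    close $in;
    $bytes = '' unless defined $bytes;

    $bytes = $transform->($bytes);

    # The temporary file lives in the same directory, so rename() stays
    # on one filesystem.
    my ($out, $tmp) = tempfile(DIR => dirname($file), UNLINK => 0);
    binmode $out, ':raw';
    print {$out} $bytes or die "Cannot write $tmp: $!";
    close $out          or die "Cannot close $tmp: $!";
    rename $tmp, $file  or die "Cannot rename $tmp to $file: $!";
    return;
}

# Demo on a throwaway file holding one UTF-8 encoded Cyrillic letter.
my ($fh, $demo) = tempfile();
binmode $fh, ':raw';
print {$fh} "\xD0\x9F http://example.com HTTP://example.org\n";
close $fh;

rewrite_raw_atomically($demo, sub {
    my ($data) = @_;
    $data =~ s{http://}{//}gi;
    return $data;
});
```

Note that tempfile creates the file with restrictive permissions, so after the rename the target may no longer have the original file's mode. Restore it with chmod if that matters.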



Source: https://stackoverflow.com/questions/41037415/reliable-perl-encoding-with-fileslurp
