What is a good/best way to count the number of characters, words, and lines of a text file using Perl (without using wc)?
I stumbled upon this while googling for a character count solution. Admittedly, I know next to nothing about perl so some of this may be off base, but here are my tweaks of newt's solution.
First, there is a built-in line count variable anyway, so I just used that. This is probably a bit more efficient, I guess. As it is, the character count includes newline characters, which is probably not what you want, so I chomped $_. Perl also complained about the way the split() is done (implicit split, see: Why does Perl complain "Use of implicit split to @_ is deprecated"? ) so I tweaked that. My input files are UTF-8 so I opened them as such. That probably helps get the correct character count in the input file contains non-ASCII characters.
Here's the code:
open(FILE, "<:encoding(UTF-8)", "file.txt") or die "Could not open file: $!";
my ($lines, $words, $chars) = (0,0,0);
my @wordcounter;
while () {
chomp($_);
$chars += length($_);
@wordcounter = split(/\W+/, $_);
$words += @wordcounter;
}
$lines = $.;
close FILE;
print "\nlines=$lines, words=$words, chars=$chars\n";