I have multiple CSV files, each with a different number of entries and roughly 300 lines.
The first line in each file holds the data labels.
A good CSV parser will have no trouble with this since commas are inside the quoted fields, so you can simply parse the file with it.
A really nice module is Text::CSV_XS, which the wrapper Text::CSV loads by default if it is installed (otherwise it falls back to the pure-Perl Text::CSV_PP). The only thing to address in your data is the spaces between fields, since those aren't in the CSV specs, so I use the allow_whitespace option for that in the example below.
If you really must remove the commas for further work, do that as the parser hands you lines.
    use warnings;
    use strict;
    use feature 'say';

    use Text::CSV;

    my $file = 'commas_in_fields.csv';

    my $csv = Text::CSV->new( { binary => 1, allow_whitespace => 1 } )
        or die "Cannot use CSV: " . Text::CSV->error_diag ();

    open my $fh, '<', $file or die "Can't open $file: $!";

    my @headers = @{ $csv->getline($fh) };    # if there is a separate header line

    while (my $line = $csv->getline($fh)) {   # returns arrayref
        tr/,//d for @$line;                   # delete commas from each field
        say "@$line";
    }
This uses tr on $_ in the for loop, which changes the elements of the array in place, for conciseness.
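To see that aliasing in isolation, here is a minimal sketch. The @fields array and its values are made up for illustration and are not taken from your data.

```perl
use warnings;
use strict;
use feature 'say';

# Hypothetical sample fields, standing in for one parsed row
my @fields = ('1,234', 'Smith, John', 'plain');

# In a statement-modifier for loop, $_ is aliased to each element,
# so tr/,//d modifies the array elements themselves
tr/,//d for @fields;

say join '|', @fields;   # 1234|Smith John|plain
```

If you need to keep the original values as well, copy the row first (for example my @clean = @$line;) and run tr on the copy.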
I'd like to repeat and emphasize what others have explained: do not parse CSV by hand, since only trouble awaits; use a library. This is very much akin to parsing XML and similar formats: no regex please, but libraries.