Format of CSV not correct?

徘徊边缘 posted on 2019-12-12 11:42:51

Question


I am generating a CSV with Export-Csv in PowerShell and then feeding it to a Perl script. But Perl is unable to import the file.

I have compared the CSV file against a working version (one exported from the same Perl script rather than from PowerShell) and there is NO difference. The columns are exactly the same and both files use a semicolon as the delimiter. However, if I open the file in Excel, everything ends up in the first cell of each line (meaning I have to do a text-to-columns). The working file is split into separate cells from the start.

To add to the confusion: when I open the file in Notepad and copy/paste the contents into a new file, the import works!

So, what am I missing? Are there "hidden" properties that I cannot spot with Notepad? Do I have to change the encoding-type?

Please help:)


Answer 1:


To get a better look at your CSV files, try using Notepad++. It shows the file encoding in the status bar. Also turn on hidden characters (View > Show Symbol > Show All Characters). This reveals whether the lines end in plain line feeds or in carriage returns + line feeds, whether tabs or spaces are used, and so on. You can also change the file encoding from the Encoding menu. This may help you identify the differences. Notepad doesn't display any of this information.
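If Notepad++ is not at hand, the same inspection can be roughly approximated in a few lines of code. The following is a minimal Python sketch, not part of the original answer; the function name `describe_text_file` and the sample bytes are made up for illustration. It reports whether the bytes start with a byte-order mark and how many line endings of each style they contain. The line-ending counts are byte-level, so they are only meaningful for ASCII or UTF-8 content.

```python
import codecs

def describe_text_file(raw: bytes) -> dict:
    """Report the BOM (if any) and the line-ending styles found in raw bytes."""
    boms = [
        ("UTF-8 BOM", codecs.BOM_UTF8),
        # UTF-32 LE starts with the same two bytes as UTF-16 LE, so test it first.
        ("UTF-32 LE BOM", codecs.BOM_UTF32_LE),
        ("UTF-32 BE BOM", codecs.BOM_UTF32_BE),
        ("UTF-16 LE BOM", codecs.BOM_UTF16_LE),
        ("UTF-16 BE BOM", codecs.BOM_UTF16_BE),
    ]
    bom = next((name for name, mark in boms if raw.startswith(mark)), None)
    crlf = raw.count(b"\r\n")
    return {
        "bom": bom,
        "crlf": crlf,                          # Windows-style endings
        "lone_lf": raw.count(b"\n") - crlf,    # Unix-style endings
        "lone_cr": raw.count(b"\r") - crlf,    # old Mac-style endings
    }

# Hypothetical sample: UTF-8 with BOM and Windows line endings.
sample = codecs.BOM_UTF8 + b"a;b\r\n1;2\r\n"
report = describe_text_file(sample)
print(report)
```

For a real file, read it in binary mode first, for example `open("test.csv", "rb").read()`, and compare the report for the working file with the one for the failing file.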

Update - Here's how to convert a text file from Windows to Unix format in code:

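# Read the whole file and normalize CRLF (or a lone CR) to LF.
$allText = [IO.File]::ReadAllText("C:\test.csv") -replace "`r`n?", "`n"
# ASCII encoding writes no BOM, but turns non-ASCII characters into '?'.
$encoding = New-Object System.Text.ASCIIEncoding
[IO.File]::WriteAllText("C:\test2.csv", $allText, $encoding)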

Or you can use Notepad++ (Edit > EOL Conversion > Unix Format).




Answer 2:


It could be an encoding issue with Export-Csv.

In Windows PowerShell the default is ASCII, which should usually be fine, but try setting -Encoding UTF8 in the Export-Csv command.
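One thing to watch for when changing the encoding: in Windows PowerShell, -Encoding UTF8 writes a byte-order mark (BOM) at the start of the file, and a reader that does not expect it will see the BOM glued to the first column name. Whether that is what trips up the Perl script here is not established in this thread. The Python sketch below only illustrates the effect; the file content and the helper name `first_header` are invented for the example.

```python
import csv
import io

# Hypothetical file content: UTF-8 with a BOM, semicolon-delimited, CRLF endings.
raw = "Name;Age\r\nAnn;30\r\n".encode("utf-8-sig")

def first_header(data: bytes, encoding: str) -> str:
    """Decode the bytes with the given codec and return the first header cell."""
    text = data.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=";")
    return next(reader)[0]

with_bom = first_header(raw, "utf-8")         # BOM survives as U+FEFF in the name
without_bom = first_header(raw, "utf-8-sig")  # BOM is stripped while decoding
print(repr(with_bom), repr(without_bom))
```

A lookup by column name fails on the first variant even though both strings look identical in most editors.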




Answer 3:


From CPAN Text::CSV:

use Text::CSV;

my @rows;
my $csv = Text::CSV->new ( { binary => 1 } )  # should set binary attribute.
             or die "Cannot use CSV: ".Text::CSV->error_diag();

open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
while ( my $row = $csv->getline( $fh ) ) {
  $row->[2] =~ m/pattern/ or next; # 3rd field should match
  push @rows, $row;
}
$csv->eof or $csv->error_diag();
close $fh;

Never try to parse CSV yourself: it seems easy at first glance, but there are a lot of deep pits to fall into.
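To make one of those pits concrete, here is a small sketch in Python (not the Perl Text::CSV code above; the sample line is invented) comparing a naive split on the delimiter with a real CSV parser. The sample has a quoted field that contains the delimiter and another that contains escaped quotes.

```python
import csv
import io

# Hypothetical semicolon-delimited line with quoted fields.
line = 'Smith;"Acme; Inc.";"He said ""hi"""\n'

# Naive approach: split on every semicolon, ignoring the quoting rules.
naive = line.rstrip("\n").split(";")

# Parser approach: the csv module honours quotes and doubled-quote escapes.
parsed = next(csv.reader(io.StringIO(line, newline=""), delimiter=";"))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split yields four pieces with stray quote characters, while the parser returns the three intended fields.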




Answer 4:


Excel tends to assume that files saved with the .csv extension are indeed comma-delimited (more precisely, it uses the list separator from the Windows regional settings, which is a comma in many locales). However, it seems you are using semicolons. You can try switching to commas or, if that is not an option, changing the extension to .txt. With commas, Excel should split the columns automatically; with a .txt extension, it will take you through the import wizard when you load the file.
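Switching delimiters is safer with a CSV reader and writer than with a plain search-and-replace, because a field may itself contain a comma and then needs quoting. A minimal Python sketch under that assumption follows; the function name `semicolons_to_commas` and the sample data are hypothetical.

```python
import csv
import io

def semicolons_to_commas(text: str) -> str:
    """Re-write semicolon-delimited CSV text as comma-delimited CSV text."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=";")
    out = io.StringIO(newline="")
    # The writer adds quotes only where a field needs them (default QUOTE_MINIMAL).
    writer = csv.writer(out, delimiter=",", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Hypothetical input: the second row has a field containing a comma.
converted = semicolons_to_commas('id;name\n1;"Doe, Jane"\n2;Bob\n')
print(converted)
```

The field "Doe, Jane" stays quoted in the output, so it is still read as a single column.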




Answer 5:


Given what has been discovered through the other posts, I think your best bet is to:

  1. Convert to a CSV string (which is meant to use Unix-style line endings rather than Windows ones)
  2. Send that to a file, ensuring the encoding is not ASCII.

$str = $object | ConvertTo-Csv -NoTypeInformation | ForEach-Object { $_ -replace "`"","" }

ForEach-Object is a hack to remove the extra quotes that ConvertTo-Csv adds. If your data may contain double quotes (or the delimiter itself), you'll need to look at alternatives.

$str | Out-File -FilePath "path\to\newcsv" -Encoding UTF8
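For comparison, here is how the same two goals (Unix line endings, a non-ASCII encoding) could look with a CSV writer that quotes only the fields that need it, instead of stripping every quote afterwards. This is a Python sketch rather than PowerShell, the rows and the temporary path are invented, and the semicolon delimiter is assumed from the question.

```python
import csv
import os
import tempfile

# Hypothetical rows; one field contains quotes, another contains the delimiter.
rows = [["name", "note"], ["Ann", 'said "hi"'], ["Bob", "a;b"]]

path = os.path.join(tempfile.mkdtemp(), "newcsv.csv")

# newline="" stops Python from translating "\n" into the platform line ending;
# encoding="utf-8" writes UTF-8 without a byte-order mark.
with open(path, "w", encoding="utf-8", newline="") as fh:
    writer = csv.writer(fh, delimiter=";", quoting=csv.QUOTE_MINIMAL,
                        lineterminator="\n")
    writer.writerows(rows)

with open(path, "rb") as fh:
    written = fh.read()
print(written)
```

Plain fields are written without quotes, and the two awkward fields stay parseable because the writer quotes and escapes them.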


Source: https://stackoverflow.com/questions/8957629/format-of-csv-not-correct
