How do I handle/store multiple lines into a single field read from a file in perl?

Posted by 雨燕双飞 on 2019-12-10 19:29:19

Question


I am trying to process a text file in perl. I need to store the data from the file into a database. The problem that I'm having is that some fields contain a newline, which throws me off a bit. What would be the best way to contain these fields?

Example data.txt file:

ID|Title|Description|Date
1|Example 1|Example Description|10/11/2011
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011
3|Example 3|Short description|10/13/2011

The current (broken) Perl script (example):

#!/usr/bin/perl -w
use strict;

open (MYFILE, 'data.txt');
while (<MYFILE>) {
    chomp;
    my ($id, $title, $description, $date) = split(/\|/);

    if ($id ne 'ID') {
        # processing certain fields (...)

        # insert into the database (example)
        $sqlInsert->execute($id, $title, $description, $date);
    }
}
close (MYFILE);

As you can see from the example, in the case of ID 2, it's broken into several lines causing errors when attempting to reference those undefined variables. How would you group them into the correct field?

Thanks in advance! (I hope the question was clear enough, difficult to define the title)


Answer 1:


I would just count the number of separators before splitting the line. If you don't have enough, read the next line and append it. The tr operator is an efficient way to count characters.

#!/usr/bin/perl -w
use strict;
use warnings;

open (MYFILE, '<', 'data.txt') or die "Cannot open data.txt: $!";
while (<MYFILE>) {
    # Continue reading while line incomplete:
    while (tr/|// < 3) {
        my $next = <MYFILE>;
        die "Incomplete line at end" unless defined $next;
        $_ .= $next;
    }

    # Remaining code unchanged:
    chomp;
    my ($id, $title, $description, $date) = split(/\|/);

    if ($id ne 'ID') {
        # processing certain fields (...)

        # insert into the database (example)
        $sqlInsert->execute($id, $title, $description, $date);
    }
}
close (MYFILE);
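
To try the separator-counting idea without a data file or a database, here is a minimal self-contained sketch. It assumes the sample data from the question is held in a string and read through an in-memory filehandle; the `read_records` sub and its `$expected_separators` parameter are hypothetical names introduced only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads pipe-delimited records from a filehandle and
# joins physical lines until each record has the expected number of separators.
sub read_records {
    my ($fh, $expected_separators) = @_;
    my @records;
    while (my $line = <$fh>) {
        # tr/|// with an empty replacement list counts pipes without changing $line
        while (($line =~ tr/|//) < $expected_separators) {
            my $next = <$fh>;
            die "Incomplete record at end of input\n" unless defined $next;
            $line .= $next;
        }
        chomp $line;    # removes only the final newline; embedded ones stay
        push @records, [ split /\|/, $line, -1 ];
    }
    return @records;
}

# Sample data from the question, read through an in-memory filehandle
my $data = <<'END';
ID|Title|Description|Date
1|Example 1|Example Description|10/11/2011
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011
3|Example 3|Short description|10/13/2011
END

open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
my @records = read_records($fh, 3);
close($fh);

for my $record (@records) {
    my ($id, $title, $description, $date) = @$record;
    next if $id eq 'ID';    # skip the header row
    print "$id ($date): $description\n";
}
```

The description of ID 2 keeps its embedded newlines, so strip or replace them before the insert if the database column should hold a single line.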



Answer 2:


Keep reading the next line until you have the number of fields you need. Something like this (I haven't tested this code):

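my @fields = split(/\|/, $_, -1);
until (@fields >= 4) { # Repeat until we get 4 fields in the array

  my $next = <MYFILE>; # Read next line into a variable (a bare <MYFILE>; would discard it)
  last unless defined $next; # Stop at end of file
  chomp $next;

  # Split line, keeping empty fields
  my @add_fields = split(/\|/, $next, -1);
  @add_fields = ('') unless @add_fields; # An empty line still continues the field

  # Append the first element of the current line to the last element so far,
  # restoring the newline that chomp removed
  $fields[$#fields] .= "\n" . shift(@add_fields);

  # Append the remaining elements
  push(@fields, @add_fields);

}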
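
One caveat when counting fields after `split`: by default Perl's `split` drops trailing empty fields, so a record whose last fields are empty would look incomplete and the loop would swallow the following line. A small sketch, using a made-up record (ID 4 is not part of the original data), of how a negative limit keeps those fields:

```perl
use strict;
use warnings;

# Hypothetical record with empty Description and Date fields
my $line = '4|Example 4||';

my @default  = split /\|/, $line;        # trailing empty fields are dropped
my @keep_all = split /\|/, $line, -1;    # a negative limit keeps them

print scalar(@default), "\n";     # prints 2
print scalar(@keep_all), "\n";    # prints 4
```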



Answer 3:


If you could change your data.txt file to include the pipe separator as the last character in every line/record, you could slurp in the whole file, splitting directly into the raw fields. This code would then do what you want:

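#!/usr/bin/perl
use strict;
use warnings;

my @fields;
{
  local $/ = "|";   # read pipe-terminated fields; restored when the block ends
  open (MYFILE, '<', 'C:/data.txt') or die "$!";
  @fields = <MYFILE>;
  close (MYFILE);
  chomp(@fields);   # with $/ set to "|", chomp strips the trailing pipe

  for(my $i = 0; $i + 3 < scalar(@fields); $i = $i + 4) {
    my $id = $fields[$i];
    $id =~ s/^\r?\n//;   # the first field still starts with the previous line break
    my $title = $fields[$i+1];
    my $description = $fields[$i+2];
    my $date = $fields[$i+3];
    if ($id =~ m/^\d+$/) {
        # processing certain fields (...)

        # insert into the database (example)
    }
  }
}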
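
To see what the record-separator approach actually yields, here is a small sketch that sets `$/` to a pipe and reads from an in-memory string rather than from `C:/data.txt`. The sample string (with the extra trailing pipe on every record) and the variable names are assumptions for illustration. Two details matter: `chomp` strips the current value of `$/`, and the first field of every record after the header still begins with the previous line's newline.

```perl
use strict;
use warnings;

# Sample data with a pipe appended to every record, as this answer requires
my $data = "ID|Title|Description|Date|\n"
         . "1|Example 1|Example Description|10/11/2011|\n"
         . "2|Example 2|A long example description\nWhich contains\na bunch of newlines|10/12/2011|\n";

my @fields;
{
    local $/ = "|";    # restored automatically when the block ends
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    @fields = <$fh>;
    close($fh);
    chomp(@fields);    # removes the trailing "|" from each field
}

my @rows;
for (my $i = 0; $i + 3 < @fields; $i += 4) {
    my @row = @fields[$i .. $i + 3];
    $row[0] =~ s/^\n//;    # drop the line break left over from the previous record
    push @rows, \@row if $row[0] =~ /^\d+$/;    # skips the header row
}

print scalar(@rows), " data rows\n";
print "$rows[1][2]\n";
```

The loop condition `$i + 3 < @fields` ignores the lone newline left after the final pipe, so no partially filled row is processed.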


Source: https://stackoverflow.com/questions/6075327/how-do-i-handle-store-multiple-lines-into-a-single-field-read-from-a-file-in-per
