How can I parse quoted CSV in Perl with a regex?

青春惊慌失措 2020-11-30 09:10

I'm having some issues parsing CSV data with quotes. My main problem is with quotes within a field. In the following example, lines 1 - 4 work correctly but 5, 6 and 7 d

7 answers
  •  [愿得一人]
    2020-11-30 09:50

    You can parse CSV using Text::ParseWords, which ships with Perl.

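    use feature 'say';
    use Text::ParseWords;
    
    # Read each record from the DATA section below
    while (<DATA>) {
        chomp;
        # Split on commas, honouring quotes; 0 = strip the quote characters
        my @f = quotewords ',', 0, $_;
        say join ":" => @f;
    }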
    
    __DATA__
    COLLOQ_TYPE,COLLOQ_NAME,COLLOQ_CODE,XDATA
    S,"BELT,FAN",003541547,
    S,"BELT V,FAN",000324244,
    S,SHROUD SPRING SCREW,000868265,
    S,"D" REL VALVE ASSY,000771881,
    S,"YBELT,"V"",000323030,
    S,"YBELT,'V'",000322933,
    

    which parses your CSV correctly:

    # => COLLOQ_TYPE:COLLOQ_NAME:COLLOQ_CODE:XDATA
    # => S:BELT,FAN:003541547:
    # => S:BELT V,FAN:000324244:
    # => S:SHROUD SPRING SCREW:000868265:
    # => S:D REL VALVE ASSY:000771881:
    # => S:YBELT,V:000323030:
    # => S:YBELT,'V':000322933:
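
    The second argument to quotewords is the keep flag, which the code above sets to 0. As a minimal sketch of what it changes (the helper name show is made up for illustration), the following compares keep = 0 and keep = 1 on one of the sample lines:

    ```perl
    use strict;
    use warnings;
    use Text::ParseWords qw(quotewords);

    my $line = 'S,"BELT,FAN",003541547,';

    # Hypothetical helper: join fields with ':', mapping the undef that
    # quotewords returns for the trailing empty field to ''.
    sub show { return join ':', map { defined $_ ? $_ : '' } @_ }

    my $stripped = show(quotewords(',', 0, $line));   # keep = 0: quotes removed
    my $kept     = show(quotewords(',', 1, $line));   # keep = 1: quotes retained

    print "$stripped\n";   # S:BELT,FAN:003541547:
    print "$kept\n";       # S:"BELT,FAN":003541547:
    ```

    With keep = 1 the quote characters stay in the field, which can help when you want to see exactly what was in the raw data.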
    

    The only issue I've had with Text::ParseWords is when nested quotes in the data aren't escaped correctly. However, that is badly built CSV data and would cause problems for most CSV parsers ;-)

    So you may notice that

    # S,"YBELT,"V"",000323030,
    

    came out as follows (i.e. the quotes around "V" were dropped):

    # S:YBELT,V:000323030:
    

    However, if it's escaped like so

    # S,"YBELT,\"V\"",000323030,
    

    then the quotes will be retained:

    # S:YBELT,"V":000323030:
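
    To check both variants side by side, here is a small sketch that runs quotewords over the unescaped and the backslash-escaped form of that record. The input lines are copied from the examples above; the variable names are only illustrative.

    ```perl
    use strict;
    use warnings;
    use Text::ParseWords qw(quotewords);

    # Same record, first with unescaped inner quotes, then backslash-escaped.
    # Single-quoted Perl strings keep the backslashes literally.
    my @lines = (
        'S,"YBELT,"V"",000323030,',
        'S,"YBELT,\"V\"",000323030,',
    );

    my @results;
    for my $line (@lines) {
        # keep = 0: strip the quoting characters and unescape \" to "
        my @fields = quotewords(',', 0, $line);
        # The trailing empty field comes back as undef; print it as ''
        my $joined = join ':', map { defined $_ ? $_ : '' } @fields;
        push @results, $joined;
        print "$joined\n";
    }
    # Prints:
    #   S:YBELT,V:000323030:
    #   S:YBELT,"V":000323030:
    ```

    Note that this relies on backslash escaping. Many CSV writers double the quote instead (""V""), and as far as I can tell quotewords drops those quotes as well.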
    
