Use batch file to remove special characters from csv file

不羁岁月 提交于 2021-01-29 07:21:48

问题


I have a csv file with 18 fields. I have written a batch file to manipulate the data. Everything works except removing the comma from the "DEVILS DUE /1FIRST COMICS, LLC" publisher. That field does not parse correctly. I have tried looking at other batch files for examples, but I am not familiar with the snytax.

@echo off & Setlocal EnableDelayedExpansion
( FOR /f "tokens=1-18 delims=," %%A in ('More +4 datatest.csv') do (
rem H is the department code
rem S is the sales tax code
rem Q is the publisher code
    Set "H=%%H"
    Set "S=T"
    Set "Q=%%Q"
    if "%%Q"=="BOOM! STUDIOS" Set "Q=BOOM STUDIOS"
    if "%%Q"=="DEVILS DUE /1FIRST COMICS, LLC" Set "Q=DEVILS DUE"
    if "%%H"=="1" Set "H=1005" 
    if "%%H"=="1" Set "S=N"
    if "%%H"=="2" Set "H=1009" 
    if "%%H"=="2" Set "S=N"
    if "%%H"=="3" Set "H=1008"
    if "%%H"=="4" Set "H=1002"
    if "%%H"=="5" Set "H=1006"
    if "%%H"=="6" Set "H=1003"
    if "%%H"=="7" Set "H=1011"
    if "%%H"=="8" Set "H=1011"
    if "%%H"=="9" Set "H=1004"
    if "%%H"=="10" Set "H=1016"
    if "%%H"=="11" Set "H=1015"
    if "%%H"=="12" Set "H=1015"
    if "%%H"=="13" Set "H=1011"
    if "%%H"=="14" Set "H=1009" 
    if "%%H"=="14" Set "S=N"
    if "%%H"=="15" Set "H=1013"  
    if "%%H"=="16" Set "H=1017"
    echo "",%%~M,%%~N,%%~L,"","","","","",!H!,"","",ITEM,"","",%%~D,%%Q,"","",%%E,"",%%E,"","","","","","","","","","","","","","",%%A,"","","","","",!S!,N,"","",DIAMOND,%%B,"",""
  )
)>paygoinvoice.csv
@echo on

回答1:


The issue you're having is that the delims=, you have in your FOR loop is causing to change DEVILS DUE /1FIRST COMICS, LLC or ANY comma's in your code as a space.

Combine this with Tokens= then %%H = DEVILS DUE /1FIRST COMICS ---And--- %%I = LLC.

A quick and dirty fix (That I know of) is to simply change all ", " to something different before running it into the main function. For my example I used 1Comma1. This will change your IF search to DEVILS DUE /1FIRST COMICS1Comma1 LLC.

Fixed.Bat:

@echo off & Setlocal EnableDelayedExpansion

Rem | Replace all ", " with "1Comma1"
for /f "tokens=1,* delims=¶" %%A in ('"type datatest.csv"') do (
    SET string=%%A
    setlocal EnableDelayedExpansion
    SET modified=!string:, =1Comma1 !

    >> datatest.csv.TEMP echo(!modified!
    endlocal
)

Rem | Main .CSV Edit Function
( FOR /f "tokens=1-8* delims=," %%A in ('More +4 datatest.csv.TEMP') do (
    Set "ItemData=%%H"
    if "%%H"=="1" Set "ItemData=1005"
    if "%%H"=="3" Set "ItemData=1008"
    if "%%H"=="BOOM STUDIOS" Set "ItemData=NEW STUDIOS"
    if "%%H"=="DEVILS DUE /1FIRST COMICS1Comma1 LLC" Set "ItemData=DEVILS DUE"

    echo %%A,%%B,%%C,%%D,%%E,%%F,%%G,!ItemData!,%%I
  )
)>paygoinvoice.txt
del datatest.csv.TEMP

@echo on

PS: The code I used for your example above was taken from your last post about the subject. Simply add your new code where it belongs.

Also keep in mind that EnableDelayedExpansion auto removes ! from the output of an FOR loop or IF statement.




回答2:


As you never showed a real world sample input file, it is difficult to help.

The problem with for /f parsing csv files is :

  1. that it does NOT obey double quoted fields and will also tokenize included commas,
  2. treats adjacent delimiters as only one, and ignores leading ones.

So the 1st issue applies, the 2nd is unknwon.

One work around is to not parse the field(s) in question and pass them quoted as arguments to a sub which does obey the quotes, and process them there.

To ease handling of array values there exits a technique to expand lists to an array, see it implemented in the following batch for DepCode and STaxCode (as I hinted in my answer to your previous question):

@echo off & Setlocal EnableDelayedExpansion

:: Build array DepCode[1..16]
Set i=0&Set "DepCode=,1005,1009,1008,1002,1006,1003,1011,1011,1004,1016,1015,1015,1011,1009,1013,1017"
Set "DepCode=%DepCode:,="&Set /a i+=1&Set "DepCode[!i!]=%"
:: Set DepCode

:: Build array STaxCode[1..16]
Set i=0&Set "STaxCode=,N,N,S,S,S,S,S,S,S,S,S,S,S,N,S,S"
Set "STaxCode=%STaxCode:,="&Set /a i+=1&Set "STaxCode[!i!]=%"
:: Set STaxCode

( FOR /f "tokens=1-16* delims=," %%A in ('More +4 SO_53917950.csv') do (
    rem H is the department code
    Set "H=!DepCode[%%~H]!"
    rem S is the sales tax code
    Set "S=!STaxCode[%%~H]!"
    rem Q is the publisher code 17th field and 18th field 
    Call :RemoveComma %%Q 

rem echo "",%%~M,%%~N,%%~L,"","","","","",!H!,"","",ITEM,"","",%%~D,"!PubCode!","","",%%E,"",%%E,"","","","","","","","","","","","","","",%%A,"","","","","",!S!,N,"","",DIAMOND,%%B,"",""
    echo "%%~A","%%~B","%%~C","%%~D","%%~E","%%~F","%%~G","!H!","%%~I","%%~J","%%~K","%%~L","%%~M","%%~N","%%~O","%%~P","!PubCode!","!R!","!S!"

  )
)>paygoinvoice.csv
Goto :Eof

:RemoveComma
Set "R=%~2"
:: remove comma from field
::Set "PubCode=%PubCode:,= %"

:: split field at first comma or slash/backslash
for /f "delims=,/\" %%a in (%1) do Set "PubCode=%%a" 

This constructed input file SO_.csv:

first  line to remove
second line to remove
third  line to remove
fourth line to remove
"HeadA","HeadB","HeadC","HeadD","HeadE","HeadF","HeadG","HeadH","HeadI","HeadJ","HeadK","HeadL","HeadM","HeadN","HeadO","HeadP","HeadQ","HeadR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","2","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","3","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","4","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","5","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","6","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","7","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","8","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","9","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","10","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","11","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","12","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","13","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","14","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","15","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","16","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"

Will have this output (with remarkably slower processing due to the additional call):

"HeadA","HeadB","HeadC","HeadD","HeadE","HeadF","HeadG","","HeadI","HeadJ","HeadK","HeadL","HeadM","HeadN","HeadO","HeadP","HeadQ","HeadR",""
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1005","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1009","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1008","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1002","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1006","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1003","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1004","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1016","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1015","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1015","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1009","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1013","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1017","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"


来源:https://stackoverflow.com/questions/53917950/use-batch-file-to-remove-special-characters-from-csv-file

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!