Count the number of lines in a txt file when newlines are inside the data

Question

I have a txt file that contains the data below:

Name    mobile  url message text
test11  1234567890  www.google.com  "Data Test New
Date:27/02/2020
Items: 1
Total: 3
Regards
ABC DATa
Ph:091 : 123456789"
test12  1234567891  www.google.com  "Data Test New one
Date:17/02/2020
Items: 26
Total: 5
Regards
user test
Ph:091 : 433333333"

As you can see, the data in my last column contains newline characters, so when I use the command below

awk 'END{print NR}' file.txt

it reports 15 lines, but the actual number of records is 3. Please suggest a command that gives this count.

Edit: The script below, taken from the answer given, does not work if there is no newline at the end of the input file:

awk -v RS='"[^"]*"' '{gsub(/\n/, " ", RT); ORS=RT} END{print NR "\n"}' test.txt

Also, my file may have 3-4 million records, so converting the file to Unix format would take time and is not my preference. Please suggest an efficient solution that works in both cases.

head 5.csv | cat -A
The above command gives me this output:

Name mobile url message text^M$

Answer 1:

With GNU awk you can do this using a custom RS:

awk -v RS='"[^"]*"' '{gsub(/(\r?\n){2,}/, "\n"); n+=gsub(/\n/, "&")}
END {print n}' <(sed '$s/$/\n/' file)

15001

Here:

-v RS='"[^"]*"': Uses this regex, which matches a double-quoted string, as the input record separator
gsub(/(\r?\n){2,}/, "\n"): Collapses a run of consecutive line breaks (LF or CRLF) into a single \n
n+=gsub(/\n/, "&"): Replaces each \n with itself (a no-op) and adds the number of replacements to n
END {print n}: Prints n at the end
sed '$s/$/\n/' file: Appends a newline to the last line (in case it is missing)
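
As a possible alternative that is not part of the answer above, the same count can be sketched with plain POSIX awk by tracking whether the number of double quotes seen so far is odd or even. The function name count_records is hypothetical. The sketch assumes that quotes only appear as field delimiters (or doubled inside a quoted field) and that there are no blank lines outside quoted fields. It reads the file line by line, so it does not need GNU extensions, a converted copy of the file, or a trailing newline.

```shell
# count_records FILE: print the number of logical records in FILE,
# treating newlines inside double-quoted fields as data.
# Hypothetical helper for illustration only.
count_records() {
  awk '
    {
      # gsub returns how many double quotes this physical line contains;
      # keep only the parity of the running total
      in_quotes = (in_quotes + gsub(/"/, "&")) % 2
      # parity 0 means every quote so far is closed, so this physical
      # line ends a logical record (a trailing CR does not matter)
      if (in_quotes == 0) n++
    }
    END { print n + 0 }
  ' "$1"
}
```

Under those assumptions, count_records file.txt prints 3 for the sample data in the question, whether the lines end in LF or CRLF and whether or not the last line is newline-terminated.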


Answer 2:

With perl, assuming the last line always ends with a newline character:

$ perl -0777 -nE 'say s/"[^"]+"(*SKIP)(*F)|\n//g' ip.txt
3

-0777 slurps the entire input file as a single string, so this isn't suitable if the input file is very large
The s command returns the number of substitutions made, which is used here to get the count of newlines
"[^"]+"(*SKIP)(*F) causes newlines within double quotes to be ignored

Use the command below if you want to count the last line even when it doesn't end with a newline character:

perl -0777 -nE 'say scalar split /"[^"]+"(*SKIP)(*F)|\n/' ip.txt

Answer 3:

Same as anubhava but with GNU sed:

<infile sed '/"/ { :a; N; /"$/!ba; s/\n/ /g; }' | wc -l

Output:

Source: https://stackoverflow.com/questions/65035029/count-number-of-line-in-txt-file-when-new-line-is-inside-data

Tags: regex, shell, Ubuntu, awk, sed