Question
I have a text file containing the following data:
Name mobile url message text
test11 1234567890 www.google.com "Data Test New
Date:27/02/2020
Items: 1
Total: 3
Regards
ABC DATa
Ph:091 : 123456789"
test12 1234567891 www.google.com "Data Test New one
Date:17/02/2020
Items: 26
Total: 5
Regards
user test
Ph:091 : 433333333"
As you can see, the data in the last column contains newline characters, so when I use the command below
awk 'END{print NR}' file.txt
it reports 15 lines, but the actual number of records is 3. Please suggest a command that counts the records correctly.
Edit: The script below, taken from the answer given, does not work if there is no newline at the end of the input file:
awk -v RS='"[^"]*"' '{gsub(/\n/, " ", RT); ORS=RT} END{print NR "\n"}' test.txt
Also, my file may have 3-4 million records, so converting the file to Unix format would take time and is not my preference. Please suggest an efficient solution that works in both cases.
head 5.csv | cat -A
The above command gives me this output:
Name mobile url message text^M$
Answer 1:
Using gnu-awk, you can do this with a custom RS:
awk -v RS='"[^"]*"' '{gsub(/(\r?\n){2,}/, "\n"); n+=gsub(/\n/, "&")}
END {print n}' <(sed '$s/$//' file)
15001
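Here:
-v RS='"[^"]*"': Uses this regex as the input record separator; it matches a double-quoted string
gsub(/(\r?\n){2,}/, "\n"): Collapses runs of two or more consecutive line breaks (LF or CRLF) into a single \n
n+=gsub(/\n/, "&"): Dummy-replaces \n with itself and adds the number of \n found to the variable n
END {print n}: Prints n at the end
sed '$s/$//' file: For the last line, adds a newline (in case it is missing)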
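The question mentions millions of records, CRLF line endings, and a possibly missing final newline, so a streaming variant that does not depend on gawk's regex RS may be worth considering. The following is a minimal sketch, not part of the original answer. It assumes that double quotes appear only as field delimiters (no backslash-escaped quotes inside a field) and that blank lines outside quotes should be counted as records. count_records is a hypothetical helper name.

```shell
# count_records FILE  (hypothetical helper, not from the original answer)
# Prints the number of logical records, treating newlines inside
# double-quoted fields as part of the record.
count_records() {
  awk '
    {
      # gsub returns how many double quotes this physical line contains;
      # an odd number flips the inside-quotes state
      if (gsub(/"/, "&") % 2) inq = !inq
      # outside quotes at end of line: a logical record just ended
      if (!inq) n++
    }
    END {
      # input that stops inside an unterminated quote is one last record
      if (inq) n++
      print n + 0
    }
  ' "$1"
}
```

It would be called as count_records file.txt. Because only quote parity is tracked, a trailing \r on each line does not affect the result, and awk still delivers a final line that lacks a newline as a record.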
Answer 2:
With perl, assuming the last line always ends with a newline character:
$ perl -0777 -nE 'say s/"[^"]+"(*SKIP)(*F)|\n//g' ip.txt
3
-0777: slurps the entire input file as a single string, so this isn't suitable if the input file is very large
the s command returns the number of substitutions made, which is used here to get the count of newlines
"[^"]+"(*SKIP)(*F): causes newlines within double quotes to be ignored
You can use the command below if you want to count the last line even when it doesn't end with a newline character:
perl -0777 -nE 'say scalar split /"[^"]+"(*SKIP)(*F)|\n/' ip.txt
Answer 3:
Same idea as anubhava's answer, but with GNU sed:
<infile sed '/"/ { :a; N; /"$/!ba; s/\n/ /g; }' | wc -l
Output:
3
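For readers less familiar with sed loops, the one-liner can be spread over several lines with comments. This is only an annotated sketch of the same script, wrapped in a hypothetical function named count_records_sed. It assumes LF line endings, a newline at the end of the file, and quoted fields that always span more than one line. With CRLF input the closing quote is followed by \r, so /"$/ would not match.

```shell
# count_records_sed FILE  (hypothetical wrapper around the one-liner above)
count_records_sed() {
  sed '
    # only lines containing a double quote start the joining loop
    /"/ {
      :a
      # append the next physical line to the pattern space
      N
      # keep appending until the pattern space ends with a closing quote
      /"$/!ba
      # replace the embedded newlines so the record occupies one line
      s/\n/ /g
    }
  ' "$1" | wc -l
}
```

Since wc -l counts newline characters, the result equals the record count only when the joined output ends with a newline.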
Source: https://stackoverflow.com/questions/65035029/count-number-of-line-in-txt-file-when-new-line-is-inside-data