Say I have a large file with many rows and many columns. I'd like to find out how many rows and columns I have using bash.
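Simple row count is $(wc -l "$file"). Use $(wc -lL "$file") to show both the number of lines and the length of the longest line. Note that -L is a GNU extension and is not part of POSIX wc, and that wc prints the file name after the counts when given a file argument.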
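One detail worth checking: wc -l counts newline characters, so a file whose last line has no trailing newline is reported one row short. A minimal sketch, assuming a throwaway sample file created with mktemp (the file and its contents are made up for illustration):

```shell
# Hypothetical sample: three rows, no newline after the last one
file=$(mktemp)
printf 'a b c\nd e\nf g h i' > "$file"

# Reading from stdin makes wc print only the number, without the file name
by_wc=$(wc -l < "$file")
by_wc=$((by_wc))          # normalize padding that some wc implementations add

# awk counts records, so the unterminated last line is included
by_awk=$(awk 'END { print NR }' "$file")

echo "wc -l: $by_wc, awk NR: $by_awk"
rm -f "$file"
```

If the two numbers differ, the file is missing its final newline and the awk count is the one that matches the visible rows.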
Perl solution:
perl -ane '$maxc = $#F if $#F > $maxc; END{$maxc++; print "max columns: $maxc\nrows: $.\n"}' file
If your input file is comma-separated:
perl -F, -ane '$maxc = $#F if $#F > $maxc; END{$maxc++; print "max columns: $maxc\nrows: $.\n"}' file
output:
max columns: 5
rows: 2
-a autosplits each input line into the @F array
$#F is the index of the last element of @F, i.e. the number of columns minus 1
-F, sets the field separator to , instead of whitespace
$. is the current line number (in the END block, the number of rows)
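The same idea (track the widest row, report the row count at the end) can also be sketched in awk instead of Perl. This is an alternative, not the Perl answer's method, and the sample file below is invented to mirror the output shown above:

```shell
# Hypothetical whitespace-separated sample: 2 rows, widest row has 5 fields
sample=$(mktemp)
printf '1 2 3\n4 5 6 7 8\n' > "$sample"

# NF is the field count of the current row; NR in END is the total row count
summary=$(awk 'NF > max { max = NF }
               END { print "max columns: " (max + 0); print "rows: " NR }' "$sample")

echo "$summary"
rm -f "$sample"
```

For comma-separated input, adding -F, to the awk call plays the same role as -F, in the Perl version.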
head -1 file.tsv | tr '\t' '\n' | wc -l
Take the first line, change tabs to newlines (use ',' instead of '\t' for comma-separated files), then count the number of lines.
awk 'BEGIN{FS=","}END{print "COLUMN NO: "NF " ROWS NO: "NR}' file
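You can use any delimiter as the field separator to find the number of rows and columns. Note that in the END block NF holds the field count of the last line read, so the column number is only meaningful if every row has the same number of fields.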
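If it is not certain that every row has the same width, a quick tally of field counts can confirm it. A minimal sketch, assuming a small made-up CSV in which one row is short:

```shell
# Hypothetical CSV: two rows with 3 fields, one row with 2
csv=$(mktemp)
printf 'a,b,c\n1,2\n3,4,5\n' > "$csv"

# For each distinct field count, print "<columns> <number of rows with that count>"
counts=$(awk -F, '{ seen[NF]++ } END { for (n in seen) print n, seen[n] }' "$csv" | sort -n)

echo "$counts"
rm -f "$csv"
```

A single output line means the file is rectangular; more than one line points to ragged rows (or to delimiters inside quoted fields, which plain -F, does not understand).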
If counting the number of columns in the first line is enough, try the following:
awk -F'\t' '{print NF; exit}' myBigFile.tsv
where \t is the column delimiter.
For rows you can simply use wc -l file
The -l option prints the total number of lines.
For columns you can simply use head -1 file | tr ";" "\n" | wc -l
Explanation
head -1 file
Grabs the first line of your file, which should be the header row, and sends it to the next command through the pipe.
| tr ";" "\n"
tr stands for translate. It translates every ; character into a newline character. In this example ; is your delimiter. It then sends the data to the next command.
wc -l
Counts the total number of lines.
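Putting the two commands together, here is a short sketch on an invented ;-separated file (the file name and contents are assumptions for illustration only):

```shell
# Hypothetical sample: one header row plus three data rows, 3 columns each
data=$(mktemp)
printf 'id;name;price\n1;apple;0.5\n2;pear;0.7\n3;plum;0.9\n' > "$data"

rows=$(wc -l < "$data")                          # stdin redirect: number only
cols=$(head -n 1 "$data" | tr ';' '\n' | wc -l)  # one output line per header field

rows=$((rows)); cols=$((cols))                   # normalize any padding from wc
echo "rows: $rows columns: $cols"
rm -f "$data"
```

The row count includes the header line, and the column count assumes the delimiter never appears inside a quoted header field.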