Command-line utility to print statistics of numbers in Linux

無奈伤痛 2020-11-30 18:46

I often find myself with a file that has one number per line. I end up importing it into Excel to view things like the median, standard deviation, and so forth.

Is there a

16 answers

时光说笑 (OP)

2020-11-30 19:15
This is a breeze with R. For a file that looks like this:
```
1
2
3
4
5
6
7
8
9
10
```
Use this:
```
R -q -e "x <- read.csv('nums.txt', header = FALSE); summary(x); sd(x[ , 1])"
```
To get this:
```
       V1       
 Min.   : 1.00  
 1st Qu.: 3.25  
 Median : 5.50  
 Mean   : 5.50  
 3rd Qu.: 7.75  
 Max.   :10.00  
[1] 3.02765
```
- The -q flag squelches R's startup licensing and help output
- The -e flag tells R you'll be passing an expression from the terminal
- x is a data.frame - a table, basically. It's a structure that accommodates multiple vectors/columns of data, which is a little peculiar if you're just reading in a single vector. This has an impact on which functions you can use.
- Some functions, like summary(), naturally accommodate data.frames. If x had multiple fields, summary() would provide the above descriptive stats for each.
- But sd() can only take one vector at a time, which is why I index x for that command (x[ , 1] returns the first column of x). You could use apply(x, MARGIN = 2, FUN = sd) to get the SDs for all columns.
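As a rough sketch of the last two points, the snippet below reads the same ten numbers with scan(), which returns a plain numeric vector so sd() needs no column indexing, and then uses apply() to get per-column SDs. The temporary file stands in for nums.txt, and the two-column data.frame `df` (columns `a` and `b`) is a hypothetical example, not something from the question.

```r
# Write the ten sample values to a temporary file (a stand-in for nums.txt).
path <- tempfile(fileext = ".txt")
writeLines(as.character(1:10), path)

# scan() reads whitespace-separated numbers into a numeric vector,
# so median() and sd() can be called on it directly.
v <- scan(path, quiet = TRUE)
stats <- c(median = median(v), sd = sd(v))
print(stats)

# Hypothetical two-column data.frame, used only to illustrate per-column SDs.
df <- data.frame(a = v, b = v * 2)
col_sds <- apply(df, MARGIN = 2, FUN = sd)
print(col_sds)
```

If the file only ever holds one number per line, the scan() route is probably the simpler of the two; read.csv() plus apply() pays off once there are several columns.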