sort on pipe-delimited fields not behaving as expected

Submitted by 怎甘沉沦 on 2019-12-20 03:50:28

Question


Consider this tiny text file:

ab
a

If we run it through sort(1), we get

a
ab

because of course a comes before ab.

But now consider this file:

ab|c
a|c

If we run it through sort -t'|', we again expect a to sort before ab, but it does not: ab|c comes out first. (Try it under your version of Unix and see.)

What I think is happening here is that the -t option to sort is not really delimiting fields -- it may be changing the way (say) the start of field 2 would be found, but it's not changing the way field 1 ends. a|c sorts after ab|c because '|' comes after 'b' in ASCII. (It's as if the -t'|' argument is ignored, because you get the same result without it.)
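
A quick way to check this reading is sketched below. It assumes a POSIX shell and forces the C locale with LC_ALL=C so that comparison is plain byte order; other locales may collate punctuation differently. The variable names are only for illustration.

```shell
# Numeric codes of the two characters that end up being compared.
# (POSIX printf: a leading quote in the argument yields the character's code.)
b_code=$(printf '%d' "'b")
pipe_code=$(printf '%d' "'|")

# Sort the sample with and without -t; LC_ALL=C forces plain byte order.
with_t=$(printf 'ab|c\na|c\n' | LC_ALL=C sort -t'|')
without_t=$(printf 'ab|c\na|c\n' | LC_ALL=C sort)

printf 'b=%s pipe=%s\n' "$b_code" "$pipe_code"
printf 'with -t:\n%s\nwithout -t:\n%s\n' "$with_t" "$without_t"
```

If the two sorted results match, and the code for '|' is larger than the code for 'b', that is consistent with the whole line being compared rather than just the first field.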

So is this a bug in sort or in my understanding of it? And is there a way to sort on the first pipe-delimited field properly?

This question came up in my attempt to answer another SO question, "Join Statement omitting entries".


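Answer 1:


By default, sort uses everything from field 1 to the end of the line as the sort key. The -t option only says how to split the line into fields; it does not limit the key unless you also name the fields with -k. If you want it to sort on field 1 first, then field 2, you need to specify that explicitly: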

$ sort -k1,1 -k2,2 -t'|' <<< $'ab|c\na|c'
a|c
ab|c
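
The difference between an open-ended key and a bounded one can be seen side by side. This is a minimal sketch, assuming a POSIX shell and the C locale (LC_ALL=C) for byte-order comparison; sample is a hypothetical helper that just prints the two lines from the question.

```shell
# Hypothetical helper: emits the two-line sample from the question.
sample() { printf 'ab|c\na|c\n'; }

# -k1: the key starts at field 1 and runs to the end of the line,
# which is the same as giving no -k at all.
open_key=$(sample | LC_ALL=C sort -t'|' -k1)

# -k1,1: the key starts and stops at field 1, so only "ab" and "a" are compared.
closed_key=$(sample | LC_ALL=C sort -t'|' -k1,1)

printf -- '-k1:\n%s\n\n-k1,1:\n%s\n' "$open_key" "$closed_key"
```

With -k1 the delimiter is still part of the key, so ab|c comes first; with -k1,1 the key stops before the delimiter and a|c comes first.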


Source: https://stackoverflow.com/questions/30905992/sort-on-pipe-delimited-fields-not-behaving-as-expected
