bash sort - how do I sort using timestamp

廉价感情. 提交于 2019-12-11 02:26:42

问题


I need to sort a file using shell sort in linux. The sort needs to be based on timestamp values contained within each of file's rows. The timestamps are of irregular format and don’t specify the leading zeros to months, days, etc, so the sorts I am performing are not correct (i.e. their format is “M/D/YYYY H:MI:S AM”; so so “10/12/2012 12:16:18 PM” comes before “7/24/2012 12:16:18 PM”, which comes before “7/24/2012 12:17:18 AM”).

Is it possible to sort based on timestamps?

I am using the following command to sort my file:

sort -t= -k3 file.txt -o file.txt.sorted

(use equal sign as a separator => -t=; use 3rd column as a sort column => -k3)

A sample file is as follows:

<r id="abcd" t="10/12/2012 12:16:17 AM"><d><nv n="name" v="868" /><nv n="name0" v="73" /><nv n="name1" v="13815004" /></d></r>
<r id="defg" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="0" /></d></r>
<r id="abcd" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="0" /></d></r>
<r id="zxy" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="59542676" /></d></r>
<r id="fghj" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="38" /><nv n="name0" v="0" /><nv n="name1" v="3004537" /></d></r>
<r id="defg" t="7/24/2012 12:16:18 AM"><d><nv n="name" v="177" /><nv n="name0" v="0" /><nv n="name1" v="5888870" /></d></r>

回答1:


The linux date command does a fine job of parsing dates like this, and it can translate them into more sortable things, like simple unix-time integers.

Example:

cat file | while read line; do
    datestring=$(sed -e 's/^.* t="\([^"]*\)".*$/\1/' <<<"$line")
    echo "$(date -d "$datestring" +%s) $line"
done | sort -n

then you could pass that through the appropriate cut invocation if you want that unix timestamp removed again.




回答2:


sort is a nice tool but it doesn't have enough bells and whistles to take pseudo-xml apart, convert an attribute to a sensible time value, and then sort on it.

However, such tools do exist. While the best way to do this would probably be with an XSLT transform, if the file is really as consistent as your example command expects, you could extract the time values with cut -d'"' -f4, and you can convert each one to a more sensible format with date. For example (needs GNU date):

paste <(cut -d'"' -f4 file.txt | date -f- +%s) file.txt | sort -n | cut -f2-

which extracts the date-times, one per line; feeds them to date to convert them to seconds-since-epoch; pastes each timestamp on the beginning of each line; sorts the pasted result numerically, now with numeric timestamps at the beginning, and finally removes the timestamp to get the original file back.

Test:

$ cat >file.txt <<'EOF'
<r id="abcd" t="10/12/2012 12:16:17 AM"><d><nv n="name" v="868" /><nv n="name0" v="73" /><nv n="name1" v="13815004" /></d></r>
<r id="defg" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="0" /></d></r>
<r id="abcd" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="0" /></d></r>
<r id="zxy" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="59542676" /></d></r>
<r id="fghj" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="38" /><nv n="name0" v="0" /><nv n="name1" v="3004537" /></d></r>
<r id="defg" t="7/24/2012 12:16:18 AM"><d><nv n="name" v="177" /><nv n="name0" v="0" /><nv n="name1" v="5888870" /></d></r>
EOF
$ paste <(cut -d'"' -f4 file.txt | date -f- +%s) file.txt | sort -n | cut -f2-
<r id="defg" t="7/24/2012 12:16:18 AM"><d><nv n="name" v="177" /><nv n="name0" v="0" /><nv n="name1" v="5888870" /></d></r>
<r id="abcd" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="0" /></d></r>
<r id="defg" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="0" /></d></r>
<r id="fghj" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="38" /><nv n="name0" v="0" /><nv n="name1" v="3004537" /></d></r>
<r id="zxy" t="7/24/2012 12:16:17 PM"><d><nv n="name" v="0" /><nv n="name0" v="0" /><nv n="name1" v="59542676" /></d></r>
<r id="abcd" t="10/12/2012 12:16:17 AM"><d><nv n="name" v="868" /><nv n="name0" v="73" /><nv n="name1" v="13815004" /></d></r>


来源:https://stackoverflow.com/questions/17844072/bash-sort-how-do-i-sort-using-timestamp

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!