Minimal two column numeric input data for `sort` example, with distinct permutations

Posted by 半城伤御伤魂 on 2019-12-08 12:17:36

Question


What is the smallest number of rows of two-column numeric input needed to produce four distinct sort outputs for the following four sets of options?

1. -sn -k1
2. -sn -k2
3. -sn -k1 -k2
4. -sn -k2 -k1

Here's a 6-row example (with 4 unique outputs):

6 5 
3 7 
6 3 
2 7 
4 4 
5 2

As a convenience, here is a function that takes two columns of numbers, runs all four sorts, and prints the number of unique outputs (it requires the pee command from moreutils):

# Usage: foo c1_1 c2_1 c1_2 c2_2 ...
foo() { echo "$@" | tr  -s '[:space:]' '\n' | paste - - | \
        pee "sort -sn -k1     | md5sum" \
            "sort -sn -k2     | md5sum" \
            "sort -sn -k1 -k2 | md5sum" \
            "sort -sn -k2 -k1 | md5sum" | \
        sort -u | wc -l ; }
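
For readers without moreutils, the same count can be sketched with a plain loop. This is only an illustration: count_sorts is a hypothetical name, cksum stands in for md5sum, and the unquoted $keys assumes a POSIX sh or bash that word-splits unquoted variables.

```shell
# Hypothetical pee-free variant of foo (the name count_sorts is made up).
# Usage: count_sorts c1_1 c2_1 c1_2 c2_2 ...
count_sorts() {
    # Turn the argument list into two tab-separated columns.
    input=$(printf '%s\n' "$@" | paste - -)
    for keys in "-k1" "-k2" "-k1 -k2" "-k2 -k1"; do
        # $keys is deliberately unquoted so "-k1 -k2" splits into two options.
        printf '%s\n' "$input" | sort -sn $keys | cksum
    done | sort -u | wc -l
}

# The 6-row example from above.
count_sorts 6 5 3 7 6 3 2 7 4 4 5 2
```

For the 6-row example this should print 4, matching the count stated above.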

So to count the unique permutations of this input:

8  5
3  1
8  3

Run this:

foo 8 5 3 1 8 3

Output:

2

(Only two unique outputs. Not enough...)


Note: This question was inspired by the obscurity of the current version of the sort manual, specifically COLUMNS=65 man sort | grep -A 17 KEYDEF | sed 3,18d. The info sort page's treatment of KEYDEFs is much better.

KEYDEFs are more useful than they might first seem. The -u or --unique switch works nicely with the KEYDEFs, and in effect allows sort to delete unwanted redundant lines, and therefore can furnish a more concise substitute for certain sed or awk scripts and similar pipelines.
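
A small sketch of that use of -u with a KEYDEF, assuming GNU sort. The helper name first_per_key and the sample data are made up for illustration.

```shell
# Hypothetical helper: keep one line per distinct numeric value in field 1.
# -k1,1 restricts the key to the first field only; -n compares it numerically;
# -u drops every line whose key equals that of a line already kept.
first_per_key() { sort -u -n -k1,1; }

# Made-up sample data: a numeric id followed by a label.
printf '%s\n' '2 pear' '1 apple' '2 plum' '1 fig' | first_per_key
```

This prints two lines, one for id 1 and one for id 2. The GNU documentation describes -u as keeping only the first of each run of lines that compare equal.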


Answer 1:


I can do it in 3 by varying the whitespace:

1 1
 2 1
1  2

Your foo function doesn't produce this kind of output, but since it was only a "convenience" and not a part of the question proper, I declare this answer correct and minimal!

Sneakier version:

2       1
11      1
2       2

(The last line contains a tab; the others don't.)

With the -s option, I can't exploit non-numeric comparisons, but then I can exploit the stability of the sort:

1   2
2   1
1   1

The 1 1 line goes above both of the others if both fields are compared numerically, regardless of which comparison is done first. The ordering of the two comparisons determines the ordering of the other two lines.

On the other hand, if one of the fields isn't used for comparison, the 1 1 line stays below one of the other lines (and which one that is depends on which field is used for comparison).
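
To see the four orderings side by side, a loop like the following can be run. It is a sketch that uses single spaces between the columns, which makes no difference under -n, and it assumes a POSIX sh or bash that word-splits the unquoted $keys.

```shell
# The three rows from the last example in this answer.
rows='1 2
2 1
1 1'

for keys in "-k1" "-k2" "-k1 -k2" "-k2 -k1"; do
    echo "== sort -sn $keys"
    # $keys is deliberately unquoted so "-k1 -k2" splits into two options.
    printf '%s\n' "$rows" | sort -sn $keys
done
```

With a single key, the 1 1 row keeps its input position below the row it ties with. With both keys it moves to the top, and the key order decides which of the other two rows comes next.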



Source: https://stackoverflow.com/questions/38813821/minimal-two-column-numeric-input-data-for-sort-example-with-distinct-permutat
