How to produce cartesian product in bash?

Posted by 半腔热情 on 2019-11-27 06:45:52

Question


I want to produce a file like this (the Cartesian product of [1-3] x [1-5]):

1 1
1 2
1 3
1 4
1 5
2 1
2 2
2 3
2 4
2 5
3 1
3 2
3 3
3 4
3 5

I can do this with a nested loop:

for i in $(seq 3) 
do
  for j in $(seq 5)
  do
      echo $i $j
  done
done

Is there a solution without loops?


Answer 1:


Combine two brace expansions!

$ printf "%s\n" {1..3}" "{1..5}
1 1
1 2
1 3
1 4
1 5
2 1
2 2
2 3
2 4
2 5
3 1
3 2
3 3
3 4
3 5

This works by using a single brace expansion:

$ echo {1..5}
1 2 3 4 5

and then combining with another one:

$ echo {1..5}+{a,b,c}
1+a 1+b 1+c 2+a 2+b 2+c 3+a 3+b 3+c 4+a 4+b 4+c 5+a 5+b 5+c
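
As a minimal sketch of the same idea, the expansion can also be captured in an array, which makes it easy to check how many pairs were generated. The array name `pairs` is only an illustration and is not part of the original answer:

```bash
#!/usr/bin/env bash
# "pairs" is a hypothetical name chosen for this sketch.
# The quoted space keeps each "i j" pair together as one array element.
pairs=( {1..3}" "{1..5} )

# Each brace group multiplies the number of generated words: 3 * 5 elements.
echo "${#pairs[@]}"

# Print one pair per line, in the same order as the printf example above.
printf '%s\n' "${pairs[@]}"
```

The leftmost brace group varies slowest, so the first element is "1 1" and the last is "3 5".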



Answer 2:


A shorter (but hacky) version of Rubens's answer:

join -j 999999 -o 1.1,2.1 file1 file2

Since field 999999 most likely does not exist, join treats the (empty) join key as equal on every line of both files and therefore has to produce the Cartesian product. It uses O(N+M) memory and produces output at 100..200 Mb/sec on my machine.

I don't like the "shell brace expansion" method, such as echo {1..100}x{1..100}, for large datasets: it uses O(N*M) memory and, used carelessly, can bring your machine to its knees. It is also hard to stop, because Ctrl+C does not interrupt brace expansion, which is performed by the shell itself.
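
To try the join trick on the numbers from the question without creating files first, something like the following may work. It is a sketch that assumes GNU join and bash process substitution; `cross_lines` is a hypothetical helper name, not a standard command:

```bash
#!/usr/bin/env bash
# cross_lines is a hypothetical helper name used only in this sketch.
# Assumes GNU join: field 999999 is missing on every line, so all join
# keys compare equal (empty) and every line pairs with every other line.
cross_lines() {
  join -j 999999 -o 1.1,2.1 "$1" "$2"
}

# Pairs each of 1..3 with each of 1..5, one pair per line.
cross_lines <(seq 3) <(seq 5)
```

Note that -o 1.1,2.1 prints only the first field of each input line, so inputs with several whitespace-separated fields would need a longer -o list.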




Answer 3:


The best way to get a Cartesian product in bash is surely, as pointed out by @fedorqui, to use brace expansion. However, if your input cannot easily be generated that way (i.e., if {1..3} and {1..5} do not suffice), you can simply use join.

For example, to compute the Cartesian product of two regular files, say "a.txt" and "b.txt", you could do the following. First, create the two files:

$ printf '%s\tx\n' {a..c} | sed 's/^/1\t/' > a.txt
$ cat a.txt
1    a    x
1    b    x
1    c    x

$ echo -en "foo\nbar\n" | sed 's/^/1\t/' > b.txt
$ cat b.txt
1    foo
1    bar

Note that sed is used to prepend an identifier to each line. The identifier must be the same for every line of both files, so that every line of one file matches every line of the other and join produces the full Cartesian product instead of dropping unmatched lines. The join then goes as follows:

$ join -j 1 -t $'\t' a.txt b.txt | cut -d $'\t' -f 2-
a    x    foo
a    x    bar
b    x    foo
b    x    bar
c    x    foo
c    x    bar

After the files are joined, cut removes the column of "1"s that was prepended.
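
The three steps above (prepend a constant key, join on it, cut it off again) can be wrapped into one function so that the input files are left untouched. This is only a sketch: `cross_join` is a hypothetical name, and it uses awk instead of sed to add the key, which is a substitution for illustration and not what the answer itself does:

```bash
#!/usr/bin/env bash
# cross_join is a hypothetical helper name, not a standard tool.
# Usage: cross_join FILE1 FILE2   (both tab-separated)
cross_join() {
  local tab=$'\t'
  # Prepend the constant key "1" to every line of both inputs,
  # join on that key, then drop the key column again.
  join -t "$tab" -j 1 \
    <(awk -v OFS='\t' '{ print "1", $0 }' "$1") \
    <(awk -v OFS='\t' '{ print "1", $0 }' "$2") |
    cut -d "$tab" -f 2-
}
```

With unprefixed versions of the two files above, cross_join a.txt b.txt should print the same six lines as the join command shown earlier.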



Source: https://stackoverflow.com/questions/23363003/how-to-produce-cartesian-product-in-bash
