How to generate 10 million random strings in bash

Submitted by 廉价感情 on 2021-02-08 12:07:06

Question


I need to make a big test file for a sorting algorithm. For that I need to generate 10 million random strings. How do I do that? I tried using cat on /dev/urandom, but it kept running for minutes, and when I looked in the file there were only around 8 pages of strings. How do I generate 10 million strings in bash? The strings should be 10 characters long.


Answer 1:


This will not guarantee uniqueness, but it gives you 10 million random lines in a file. Not the fastest option, but it ran in under 30 seconds on my machine:

cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 10 | head -n 10000000 > file
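A scaled-down run is an easy way to sanity-check the pipeline before committing to 10 million lines. This sketch (assuming GNU or BSD tr, fold, and head) reads /dev/urandom directly instead of piping it through cat, and generates 1,000 lines for a quick look:

```shell
# LC_ALL=C keeps tr operating on raw bytes regardless of locale.
# Scaled down to 1,000 lines; raise head -n for the real run.
LC_ALL=C tr -dc 'a-zA-Z0-9' </dev/urandom | fold -w 10 | head -n 1000 > sample.txt
wc -l sample.txt
```

tr exiting with SIGPIPE once head has its lines is expected here; that is what stops the infinite /dev/urandom stream.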



Answer 2:


Using openssl:

#!/bin/bash

# 10 million strings x 10 hex chars each = 100M hex chars = 50M random bytes
openssl rand -hex $(( 10000000 * 5 )) | \
while IFS= read -r -n 10 -d '' r; do
    echo "$r"
done
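The bash read loop is the slow part; letting fold do the splitting is much faster. A minimal sketch of that variant, using the same 50M-byte budget (each random byte becomes two hex characters):

```shell
# openssl emits one long hex line; fold cuts it into 10-char lines.
# 10000000 * 5 bytes -> 100,000,000 hex chars -> 10,000,000 lines.
openssl rand -hex $(( 10000000 * 5 )) | fold -w 10 > file
```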



Answer 3:


Update: if you have shuf from GNU coreutils, you can use:

shuf -i 1-10000000 > file

This takes about 2 seconds on my computer. (Thanks rici!)


You can use awk to generate sequential numbers and shuffle them with shuf:

awk 'BEGIN{for(i=1;i<10000001;i++){print i}}' | shuf > big-file.txt

This takes about 5 seconds on my computer.
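Note that shuf emits the numbers 1 through 10000000 at their natural widths, so the lines are unique but not all 10 characters long. A hedged sketch that zero-pads them with awk so every line is exactly 10 characters (still assuming GNU shuf):

```shell
# shuf -i produces a random permutation of 1..10000000 (unique values);
# awk's %010d pads each number to a fixed 10-character width.
shuf -i 1-10000000 | awk '{printf "%010d\n", $1}' > file
```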




Answer 4:


If they don't need to be unique, you can do:

$ awk -v n=10000000 'BEGIN{for (i=1; i<=n; i++) printf "%010d\n", int(rand()*n)}' >big_file

That runs in about 3 seconds on my iMac.




Answer 5:


Don't generate it, download it. For example, nic.funet.fi hosts the file 100Mrnd (size 104857600 bytes) in its /dev directory (abbreviated to just funet below). 10M rows of 10 bytes each is 100M bytes, but since xxd converts each binary byte into two hex characters (\x12 -> 12), we only need 50M bytes of input:

$ wget -S -O - ftp://funet/100Mrnd | head -c 50000000 | xxd -p | fold -w 10 > /dev/null
$ head -5 file
f961b3ef0e
dc0b5e3b80
513e7c37e1
36d2e4c7b0
0514e626e5

(Replace funet with the domain name and path given, and /dev/null with your desired filename; the head -5 above assumes the output went to file.)
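The same hex-conversion trick works locally without any download, assuming xxd is installed (it usually ships with vim):

```shell
# 50M random bytes -> 100M hex chars -> 10M lines of 10 chars.
# xxd -p emits 60-char lines, which fold -w 10 splits evenly.
head -c 50000000 /dev/urandom | xxd -p | fold -w 10 > file
```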



Source: https://stackoverflow.com/questions/47501917/how-to-generate-10-million-random-strings-in-bash
