I have a computer with 1 MB of RAM and no other local storage. I must use it to accept 1 million 8-digit decimal numbers over a TCP connection, sort them, and then send the
To represent the sorted array one can just store the first element and the difference between adjacent elements. In this way we are concerned with encoding 10^6 elements that can sum up to at most 10^8. Let's call this D. To encode the elements of D one can use a Huffman code. The dictionary for the Huffman code can be created on the go and the array updated every time a new item is inserted in the sorted array (insertion sort). Note that when the dictionary changes because of a new item the whole array should be updated to match the new encoding.
The average number of bits for encoding each element of D is maximized if we have equal number of each unique element. Say elements d1, d2, ..., dN in D each appear F times. In that case (in worst case we have both 0 and 10^8 in input sequence) we have
sum(1<=i<=N) F. di = 10^8
where
sum(1<=i<=N) F = 10^6, or F=10^6/N and the normalized frequency will be p= F/10^=1/N
The average number of bits will be -log2(1/P) = log2(N). Under these circumstances we should find a case that maximizes N. This happens if we have consecutive numbers for di starting from 0, or, di= i-1, therefore
10^8=sum(1<=i<=N) F. di = sum(1<=i<=N) (10^6/N) (i-1) = (10^6/N) N (N-1)/2
i.e.
N <= 201. And for this case average number of bits is log2(201)=7.6511 which means we will need around 1 byte per input element for saving the sorted array. Note that this doesn't mean D in general cannot have more than 201 elements. It just sows that if elements of D are uniformly distributed, it cannot have more than 201 unique values.