I have a file with numbers on each line:
0101
1010
1311
0101
1311
431
1010
431
420
I want to have a hash with the number of occurrences of each number.
ID = -> x { x } # Why is the identity function not in the core lib?
f = <<-HERE
0101
1010
1311
0101
1311
431
1010
431
420
HERE
Hash[f.lines.map(&:to_i).group_by(&ID).map {|n, ns| [n, ns.size] }]
# { 101 => 2, 1010 => 2, 1311 => 2, 431 => 2, 420 => 1 }
You simply group the numbers by themselves using Enumerable#group_by, which gives you something like
{ 101 => [101, 101], 420 => [420] }
And then you Enumerable#map the value arrays to their lengths, i.e. [101, 101] becomes 2. Then just convert it back to a Hash using Hash::[].
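To make those three steps concrete, here is the same pipeline split into intermediate values. This is only a sketch on a shortened sample input; the variable names (lines, numbers, groups, pairs, counts) are made up for illustration, and a block is used in place of ID so the snippet stands alone.

```ruby
# Shortened sample input; each element is what String#lines would yield.
lines = ["0101\n", "1010\n", "0101\n", "420\n"]

# Step 1: parse each line as an integer (leading zeros and the newline are dropped).
numbers = lines.map(&:to_i)
# => [101, 1010, 101, 420]

# Step 2: group the numbers by themselves.
groups = numbers.group_by { |n| n }
# => { 101 => [101, 101], 1010 => [1010], 420 => [420] }

# Step 3: replace each value array with its length.
pairs = groups.map { |n, ns| [n, ns.size] }
# => [[101, 2], [1010, 1], [420, 1]]

# Step 4: turn the array of pairs back into a Hash.
counts = Hash[pairs]
# => { 101 => 2, 1010 => 1, 420 => 1 }
```

If you are on Ruby 2.7 or later, Enumerable#tally should give the same counts in one call (numbers.tally), so the grouping step is not needed there.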
However, if you are willing to use a third-party library, it becomes even more trivial, because if you use a MultiSet data structure, the answer falls out naturally. (A MultiSet is like a Set, except that an item can be added multiple times and the MultiSet will keep count of how often an item was added – which is exactly what you want.)
require 'multiset' # Google for it, it's so old that it isn't available as a Gem
Multiset[*f.lines.map(&:to_i)]
# => #
Yes, that's it.
That's the beautiful thing about using the right data-structure: your algorithms become massively simpler. Or, in this particular case, the algorithm just vanishes.
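If you would rather not pull in a library, the idea behind a MultiSet can be sketched with a Hash whose default value is 0. Note that TinyMultiset below is a hypothetical class invented for this illustration; it is not the API of the multiset library mentioned above, and it covers only adding and counting.

```ruby
# Hypothetical, minimal stand-in for a multiset (not the 'multiset' library's API).
class TinyMultiset
  def initialize(items = [])
    @counts = Hash.new(0) # items never added count as 0
    items.each { |item| add(item) }
  end

  # Add one occurrence of item; returns self so calls can be chained.
  def add(item)
    @counts[item] += 1
    self
  end

  # How many times item was added.
  def count(item)
    @counts[item]
  end

  # Plain Hash of item => count, as a copy.
  def to_h
    @counts.dup
  end
end

f = "0101\n1010\n1311\n0101\n1311\n431\n1010\n431\n420\n"

bag = TinyMultiset.new(f.lines.map(&:to_i))
bag.count(101) # => 2
bag.count(420) # => 1
bag.count(999) # => 0
bag.to_h       # => { 101 => 2, 1010 => 2, 1311 => 2, 431 => 2, 420 => 1 }
```

The counting happens as a side effect of adding items, which is why no separate counting algorithm is needed once the data is in the structure.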
I've written more about using MultiSets for solving this exact problem at
group_by example here.)