Pretty file size in Ruby?

谁都会走 submitted on 2019-11-28 19:22:28

How about the Filesize gem? It seems to be able to convert from bytes (and other formats) into pretty-printed values:

Example:

Filesize.from("12502343 B").pretty      # => "11.92 MiB"

http://rubygems.org/gems/filesize

If you are using Rails, what about the standard Rails number helper?

http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_human_size

number_to_human_size(number, options = {})

I agree with @David that it's probably best to use an existing solution, but to answer your question about what you're doing wrong:

  1. The primary error is dividing s by self rather than the other way around.
  2. You really want to divide by the previous s, so divide s by 1024.
  3. Doing integer arithmetic will give you confusing results, so convert to float.
  4. Perhaps round the answer.

So:

class Integer
  def to_filesize
    {
      'B'  => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s }
  end
end

That lets you write:

1.to_filesize
# => "1.0B"
1020.to_filesize
# => "1020.0B" 
1024.to_filesize
# => "1.0KB" 
1048576.to_filesize
# => "1.0MB"

Again, I don't recommend actually doing that, but it seems worth correcting the bugs.

This is my solution:

def filesize(size)
  units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB']

  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  exp = 6 if exp > 6 

  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end

Compared to the other solutions, it is simpler, more efficient, and produces better-formatted output.
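The exponent can also be computed without floating-point logarithms. The following is only a sketch, assuming the input is a non-negative Integer; filesize_int is a hypothetical name, not part of the answer above. It relies on Integer#bit_length: one unit step is 10 bits, because 2**10 == 1024.

```ruby
# Hypothetical integer-only variant of the filesize method (assumes size >= 0).
def filesize_int(size)
  units = %w[B KiB MiB GiB TiB PiB EiB]
  return '0.0 B' if size.zero?

  # bit_length - 1 equals floor(log2(size)); dividing by 10 gives floor(log1024(size)).
  exp = (size.bit_length - 1) / 10
  exp = units.size - 1 if exp >= units.size # clamp to the largest known unit

  format('%.1f %s', size.to_f / 1024**exp, units[exp])
end

puts filesize_int(1023)    # 1023.0 B
puts filesize_int(1024)    # 1.0 KiB
puts filesize_int(1024**3) # 1.0 GiB
```

Because the exponent comes from integer arithmetic, values sitting exactly on a unit boundary cannot be misclassified by rounding in Math.log.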

Format

Both to_filesize and to_human have issues with big numbers. format_mb has an odd case where, for example, '1 MiB' is reported as '1024 KiB', which some people might want, but certainly not me.

    origin:       filesize    to_filesize      format_mb       to_human
       0 B:          0.0 B           0.0B            0 b         0.00 B
       1 B:          1.0 B           1.0B            1 b         1.00 B
      10 B:         10.0 B          10.0B           10 b        10.00 B
    1000 B:       1000.0 B        1000.0B         1000 b      1000.00 B
     1 KiB:        1.0 KiB          1.0KB         1024 b        1.00 KB
   1.5 KiB:        1.5 KiB          1.5KB       1536.0 b        1.50 KB
    10 KiB:       10.0 KiB         10.0KB      10.000 kb       10.00 KB
   100 KiB:      100.0 KiB        100.0KB     100.000 kb      100.00 KB
  1000 KiB:     1000.0 KiB       1000.0KB    1000.000 kb     1000.00 KB
     1 MiB:        1.0 MiB          1.0MB    1024.000 kb        1.00 MB
     1 GiB:        1.0 GiB          1.0GB    1024.000 mb        1.00 GB
     1 TiB:        1.0 TiB          1.0TB    1024.000 gb        1.00 TB
     1 PiB:        1.0 PiB          ERROR    1024.000 tb        1.00 PB
     1 EiB:        1.0 EiB          ERROR    1024.000 pb        1.00 EB
     1 ZiB:     1024.0 EiB          ERROR    1024.000 eb          ERROR
     1 YiB:  1048576.0 EiB          ERROR 1048576.000 eb          ERROR

Performance

Also, it has the best performance.

                      user     system      total        real
filesize:         2.740000   0.000000   2.740000 (  2.747873)
to_filesize:      3.560000   0.000000   3.560000 (  3.557808)
format_mb:        2.950000   0.000000   2.950000 (  2.949930)
to_human:         5.770000   0.000000   5.770000 (  5.783925)

I tested each implementation with a realistic random number generator:

def numbers
  Enumerator.new do |enum|
    1000000.times do
      exp = rand(5)
      num = rand(1024 ** exp)
      enum.yield num
    end
  end
end
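For reference, here is one way such a measurement could be set up with Ruby's standard Benchmark module. This is a sketch, not the harness that produced the figures above: the count parameter on numbers and the reduced sample size are assumptions made to keep the example quick to run.

```ruby
require 'benchmark'

# filesize repeated from the answer so the sketch is self-contained.
def filesize(size)
  units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB']
  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  exp = 6 if exp > 6
  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end

# Same generator as above; the count parameter is an addition for this sketch.
def numbers(count = 1_000_000)
  Enumerator.new do |enum|
    count.times do
      exp = rand(5)
      enum.yield rand(1024**exp)
    end
  end
end

# Materialise the input once so every candidate is timed on the same values.
samples = numbers(10_000).to_a

Benchmark.bm(12) do |bm|
  bm.report('filesize:') { samples.each { |n| filesize(n) } }
  # Further candidates would each get their own bm.report line here.
end
```

Generating the samples before timing keeps the cost of rand out of the measurement.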

You get points for adding a method to Integer, but this seems more File-specific, so I would suggest monkeying around with File instead, say by adding a method to File called .prettysize().
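A minimal sketch of that idea follows. File.prettysize is hypothetical (Ruby's File class has no such method), and the unit list and formatting are assumptions; only File.size and Tempfile.create are standard library calls.

```ruby
require 'tempfile'

class File
  # Hypothetical helper: not part of Ruby's File API.
  def self.prettysize(path)
    units = %w[B KiB MiB GiB TiB PiB EiB]
    size = File.size(path)

    # Whole bytes are printed as an integer, without a fractional part.
    return "#{size} B" if size < 1024

    # floor(log1024(size)) via bit_length, clamped to the largest unit.
    exp = [(size.bit_length - 1) / 10, units.size - 1].min
    format('%.1f %s', size.to_f / 1024**exp, units[exp])
  end
end

Tempfile.create('prettysize-demo') do |f|
  f.write('x' * 2048)
  f.flush
  puts File.prettysize(f.path) # 2.0 KiB
end
```

Reopening File is still monkey patching, so the same caveats apply as for patching Integer.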

But here is an alternative solution that uses iteration, and avoids printing single bytes as float :-)

def format_mb(size)
  conv = ['b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb']
  scale = 1024

  # Below 2 KiB, print the raw value without converting it to a float.
  return "#{size} #{conv[0]}" if size < 2 * scale

  size = size.to_f
  (2..7).each do |ndx|
    # Use the unit one step below the threshold, so the printed value stays below 2048.
    if size < 2 * (scale**ndx)
      return "#{'%.3f' % (size / (scale**(ndx - 1)))} #{conv[ndx - 1]}"
    end
  end

  # Anything larger is expressed in the largest unit.
  "#{'%.3f' % (size / (scale**6))} #{conv[6]}"
end

@Darshan Computing's solution is only a partial fix. In Ruby 1.8, hash keys are not guaranteed to be ordered (Ruby 1.9 and later preserve insertion order), so on older interpreters this approach will not work reliably. You could fix this by doing something like the following inside the to_filesize method:

 conv={
      1024=>'B',
      1024*1024=>'KB',
      ...
 }
 conv.keys.sort.each { |s|
     next if self >= s
     e=conv[s]
     return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s }
 }

This is what I ended up doing for a similar method inside Float,

 class Float
   def to_human
     conv={
       1024=>'B',
       1024*1024=>'KB',
       1024*1024*1024=>'MB',
       1024*1024*1024*1024=>'GB',
       1024*1024*1024*1024*1024=>'TB',
       1024*1024*1024*1024*1024*1024=>'PB',
       1024*1024*1024*1024*1024*1024*1024=>'EB'
     }
     conv.keys.sort.each { |mult|
        next if self >= mult
        suffix=conv[mult]
        return "%.2f %s" % [ self / (mult / 1024), suffix ]
     }
   end
 end
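
An ordered Array of pairs avoids both the sort and any reliance on Hash ordering, and a fallback keeps very large values from slipping through the loop without a result. This is a sketch only: human_size is a hypothetical name, and clamping to the largest unit is an assumption about the desired behaviour.

```ruby
# Hypothetical helper; the name and the clamping behaviour are assumptions.
def human_size(bytes)
  # An Array keeps its order on every Ruby version, so no sorting is needed.
  units = [['B', 1], ['KB', 1024], ['MB', 1024**2], ['GB', 1024**3],
           ['TB', 1024**4], ['PB', 1024**5], ['EB', 1024**6]]

  # Pick the first unit whose upper bound exceeds the value,
  # falling back to the largest unit when none matches.
  suffix, divisor = units.find { |_, div| bytes < div * 1024 } || units.last

  format('%.2f %s', bytes.to_f / divisor, suffix)
end

puts human_size(1536)    # 1.50 KB
puts human_size(1024**7) # 1024.00 EB
```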