When not to use to_sym in Ruby?

Posted by 对着背影说爱祢 on 2021-01-28 03:09:28

Question


I have a large dataset from an analytics provider.

It arrives in JSON and I parse it into a hash, but due to the size of the set I'm ballooning to over a gig in memory usage. Almost everything starts as strings (a few values are numerical), and while of course the keys are duplicated many times, many of the values are repeated as well.

So I was thinking, why not symbolize all the (non-numerical) values, as well?

I've found some discussion of potential problems, but I figure it would be nice to have a comprehensive description for Ruby, since the problems seem to depend on how the interning process (what happens when you symbolize a string) is implemented.

I found this talking about Java: Is it good practice to use java.lang.String.intern()?

  • The interning process can be expensive
  • Interned strings are never de-allocated, resulting in a memory leak

(Except there's some contention on that last point.)

So, can anyone give a detailed explanation of when not to intern strings in Ruby?


Answer 1:


  • When the set of values in question is open (i.e., dynamic, with no fixed inventory), you should not convert them into symbols. In Ruby versions before 2.2, a symbol is never garbage collected once created, so an unbounded stream of new symbols leaks memory.
  • When the set of values in question is closed (i.e., static, with a fixed inventory), you are better off converting them into symbols. Each symbol is created only once and then reused, which saves memory.
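
The open-set versus closed-set rule can be sketched as a small guard around to_sym. This is only an illustration: KNOWN_STATUSES and symbolize_if_known are hypothetical names, and the status values are made up rather than taken from the question's dataset.

```ruby
require "set"

# Hypothetical closed set: values we expect to see over and over.
KNOWN_STATUSES = Set["active", "inactive", "pending"].freeze

# Convert a value to a symbol only if it belongs to the closed set.
# Anything else (unbounded, e.g. user-supplied) stays a string.
def symbolize_if_known(value)
  KNOWN_STATUSES.include?(value) ? value.to_sym : value
end

a = symbolize_if_known("active".dup)
b = symbolize_if_known("active".dup)
c = symbolize_if_known("session-8f3a2c")

a.equal?(b)  # => true, both refer to the single :active symbol
c.class      # => String, the open-set value was left alone
```

Two separate "active" strings collapse into one symbol object, while the one-off identifier never enters the symbol table.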



Answer 2:


The interning process can be expensive

There is always a tradeoff between memory and computing power, so try some of the best practices out there and benchmark to find out what works for you. A few suggestions I'd like to mention:

  • Symbols are an excellent choice for hash keys

    {name: "my name"}
    
  • Freeze strings to save memory, and try to keep the string pool small

    person[:country] = "USA".freeze
    
  • Have fun with Ruby GC tuning.
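
Applied to parsed JSON like the data in the question, the first two suggestions might look roughly like the sketch below. The payload is a made-up sample, and the example assumes Ruby 2.5 or later, where String#-@ returns a frozen string that is shared between equal strings.

```ruby
require "json"

# Hypothetical sample payload standing in for the analytics data.
payload = '[{"country":"USA","plan":"free"},{"country":"USA","plan":"pro"}]'

# symbolize_names: true makes the parser return symbol keys.
rows = JSON.parse(payload, symbolize_names: true)

# Replace each string value with its frozen, deduplicated copy.
# Numeric values are left untouched.
rows.each do |row|
  row.transform_values! { |v| v.is_a?(String) ? -v : v }
end

rows[0].keys                                 # => [:country, :plan]
rows[0][:country].equal?(rows[1][:country])  # => true, one shared "USA"
rows[0][:country].frozen?                    # => true
```

The repeated values remain ordinary (frozen) strings, so they are garbage collected like any other object once nothing references them, and no code downstream has to cope with values that suddenly became symbols.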

Interned strings are never de-allocated, resulting in a memory leak

  • Ruby 2.2 introduced garbage collection for dynamically created symbols (such as those produced by to_sym), so this concern largely no longer applies. However, overusing frozen strings and symbols can hurt performance.


Source: https://stackoverflow.com/questions/16289455/when-not-to-use-to-sym-in-ruby