How can I do string interning in C or C++?

Posted by 走远了吗 on 2020-01-01 09:12:27

Question


Is there something like Java's intern() method in C or C++? If there isn't, how can I carry out string interning in C or C++?


Answer 1:


boost::flyweight< std::string > seems to be exactly what you're looking for.
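Boost.Flyweight is not part of the standard library. As a rough illustration of the idea behind it (one shared copy per distinct value), here is a minimal sketch using only standard containers. The name `intern` is a hypothetical helper, not Boost's API, and the sketch assumes a single thread and a pool that is never freed.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical helper (not part of Boost or the standard library):
// returns a reference to the single pooled copy of `s`.
inline const std::string& intern(const std::string& s) {
    // The pool lives for the whole program, so entries are never freed.
    static std::unordered_set<std::string> pool;
    // insert() leaves the set unchanged if an equal string is already stored
    // and returns an iterator to the stored element either way.
    return *pool.insert(s).first;
}
```

Two calls with equal contents return the same object, so comparing addresses is enough to compare interned strings. References into a `std::unordered_set` stay valid when the set grows, which is what makes returning them safe here.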




Answer 2:


Is there something like the intern() method in C, as we have in Java?

Not in the standard C library.

If there isn't, how can I carry out string interning in C?

With great difficulty, I fear. The first problem is that a "string" is not a well-defined thing in C. Instead you have char *, which might point at a zero-terminated string or might merely denote a character position. Second, some strings are embedded in other objects or stored on the stack, which makes interning them impossible or meaningless. Third, C string literals are not guaranteed to be interned the way Java guarantees it. Finally, in a language without garbage collection, interning is a storage leak waiting to happen.

Having said that, the way to (attempt to) implement interning in C would be to create a hash table to hold the interned strings. You'd need to make it a precondition that a string can be interned only if it is either a literal or allocated in its own heap node. To address the storage leak, you'd need a per-string reference count to detect when an interned string can be discarded.
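The hash table plus reference count design can be sketched as follows. The answer talks about C; this sketch uses C++ containers only to stay short, and `InternTable`, `acquire`, and `release` are hypothetical names chosen for illustration. It also copies each string into the table, which sidesteps the literal-or-heap-node precondition at the cost of one copy per distinct string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical intern table: maps each distinct string to its reference count.
class InternTable {
public:
    // Returns a pointer to the pooled copy of `s` and bumps its count.
    const std::string* acquire(const std::string& s) {
        auto it = counts_.emplace(s, 0).first;  // no-op if already present
        ++it->second;
        return &it->first;
    }

    // Drops one reference; the entry is erased when the count reaches zero.
    // After that, pointers previously returned for this string are dangling.
    void release(const std::string* p) {
        auto it = counts_.find(*p);
        if (it == counts_.end()) return;  // not interned here: ignore
        if (--it->second == 0) counts_.erase(it);
    }

    // Number of distinct strings currently held.
    std::size_t size() const { return counts_.size(); }

private:
    std::unordered_map<std::string, std::size_t> counts_;
};
```

Every `acquire` must be paired with exactly one `release`. A C version would follow the same shape, with a hand-written hash table and a count field stored next to each string.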




Answer 3:


What would string interning mean in a language which has value semantics? Interning is a mechanism to force object identity for references to strings with value identity. It's relevant in languages which use reference semantics and use object identity as the default comparison function. C++ uses value semantics by default, and types like std::string don't have identity, so interning makes no sense.
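To make the distinction between value identity and object identity concrete, here is a small sketch. The helper names `same_value` and `same_object` are made up for this illustration.

```cpp
#include <cassert>
#include <string>

// True when two strings hold the same characters (value equality).
// This is what operator== on std::string checks.
inline bool same_value(const std::string& a, const std::string& b) {
    return a == b;
}

// True only when both references name the very same object (identity).
// This is roughly what == on two String references checks in Java.
inline bool same_object(const std::string& a, const std::string& b) {
    return &a == &b;
}
```

Two separately constructed `std::string` objects with the same contents satisfy `same_value` but not `same_object`, and ordinary C++ code only ever asks the first question.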

Some implementations (e.g. g++) may use a form of reference semantics for the string data, behind the scenes. Such an implementation could offer some sort of interning of that data, as an extension. (G++ doesn't, as far as I know, but does automatically "intern" empty strings.)

Most other implementations don't even use reference semantics internally. How would you intern strings in an implementation that uses the small string optimization (like Microsoft's), where in some cases the data is stored directly in the string object and there is no dynamically allocated memory?
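One way to see the small string optimization at work is to check whether the character buffer lies inside the string object itself. The probe below is a hypothetical helper, and it assumes that converting pointers to `std::uintptr_t` gives comparable addresses, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical probe: true if the character buffer lies inside the
// std::string object itself, i.e. no separate block holds the data.
inline bool stored_inline(const std::string& s) {
    auto obj = reinterpret_cast<std::uintptr_t>(&s);
    auto buf = reinterpret_cast<std::uintptr_t>(s.data());
    return buf >= obj && buf < obj + sizeof(std::string);
}
```

The result for a short string such as "hi" depends on the standard library in use. A string much longer than `sizeof(std::string)` cannot be stored inline, so the probe returns false for it. When the data is inline, each string object necessarily holds its own copy, so there is nothing for two objects to share.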



Source: https://stackoverflow.com/questions/10634918/how-can-i-do-string-interning-in-c-or-c
