Why does Hadoop need classes like Text or IntWritable instead of String or Integer?


名媛妹妹 2020-12-23 16:38

Why does Hadoop need to introduce these new classes? They just seem to complicate the interface.

4 answers

刺人心 (original poster)

2020-12-23 17:41

Some more good info:

They have two features that are relevant here.

First, they implement the "Writable" interface: they know explicitly how to write themselves to a DataOutput stream and how to read themselves back from a DataInput stream.

Second, their contents are updated via the set() operation. This lets you reuse the same value object repeatedly without creating new instances. That is a lot more efficient when the same mapper or reducer is called repeatedly: you just create your instances of the Writables in the constructor and reuse them.
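To make those two features concrete, here is a minimal sketch that uses only the JDK. SimpleWritable, IntBox and WritableReuseDemo are hypothetical names made up for this illustration; they only mirror the write(DataOutput) / readFields(DataInput) shape of Hadoop's Writable and the set()/get() style of IntWritable, and are not the Hadoop classes themselves.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that mirrors the two methods of Hadoop's Writable.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical mutable int holder, loosely modelled on IntWritable.
final class IntBox implements SimpleWritable {
    private int value;

    void set(int value) { this.value = value; }
    int get() { return value; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(value);      // exactly 4 bytes, no class metadata
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        value = in.readInt();     // overwrites this instance in place
    }
}

final class WritableReuseDemo {
    // Serializes every input value through ONE shared IntBox instance.
    static byte[] writeAll(int[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        IntBox box = new IntBox();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int v : values) {
                box.set(v);       // reuse instead of allocating a new object
                box.write(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads 'count' values back, again refilling one shared IntBox.
    static int[] readAll(byte[] data, int count) {
        int[] result = new int[count];
        IntBox box = new IntBox();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                box.readFields(in);
                result[i] = box.get();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

In this sketch, WritableReuseDemo.writeAll(new int[] {1, 2, 3}) yields 12 bytes (three 4-byte ints), and only one IntBox is ever allocated per direction, which is the reuse pattern the answer describes for mappers and reducers.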

In comparison, Java's Serializable framework "magically" serializes objects, but it does so in a way that is a bit brittle, and it is generally impossible to read in values generated by older versions of a class. The Java object stream is designed to send a whole graph of objects: it has to remember every object reference it has already pushed out, and do the same bookkeeping on the way back in. The Writables are designed to be self-contained.
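A rough way to see that difference with plain JDK classes is sketched below. SerializationSizeDemo is a hypothetical helper written for this illustration. It compares the bytes produced by DataOutputStream.writeInt (what a Writable-style write boils down to for an int) with what ObjectOutputStream emits for an Integer, and it writes the same Integer twice to show the stream's reference tracking. Exact ObjectOutputStream sizes depend on the serialized class, so none are quoted here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class SerializationSizeDemo {
    // Bytes produced by writing one int directly to a DataOutput.
    static int rawIntSize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Stream size after writing the same Integer once, then a second time.
    // Returns {sizeAfterFirstWrite, sizeAfterSecondWrite}.
    static int[] objectStreamSizes(Integer value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);   // stream header + class descriptor + data
            out.flush();
            int afterFirst = bytes.size();
            out.writeObject(value);   // same reference: stream emits a back-reference
            out.flush();
            int afterSecond = bytes.size();
            return new int[] {afterFirst, afterSecond};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

rawIntSize always returns 4. The first ObjectOutputStream write is considerably larger because it carries a stream header and a class description, and the second write of the same object is small only because the stream remembered the reference, which is the graph bookkeeping the answer refers to.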

This is from: http://hortonworks.com/community/forums/topic/why-hadoop-uses-default-longwritable-or-intwritable/
