Hadoop - composite key

Submitted by 早过忘川 on 2019-11-28 21:39:00

Just compose your own Writable. In your example a solution could look like this:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

import com.google.common.collect.ComparisonChain;

public class UserPageWritable implements WritableComparable<UserPageWritable> {

  private String userId;
  private String pageId;

  // Hadoop instantiates keys reflectively, so a public no-arg constructor is required.
  public UserPageWritable() {
  }

  public UserPageWritable(String userId, String pageId) {
    this.userId = userId;
    this.pageId = pageId;
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    userId = in.readUTF();
    pageId = in.readUTF();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeUTF(userId);
    out.writeUTF(pageId);
  }

  @Override
  public int compareTo(UserPageWritable o) {
    return ComparisonChain.start()
        .compare(userId, o.userId)
        .compare(pageId, o.pageId)
        .result();
  }

}

Although I suspect your IDs could be longs, here is the String version. This is just normal serialization through the Writable interface. Note that Hadoop creates key instances via reflection, so the class needs a public default constructor — always provide one.
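If the IDs really are numeric, the same pattern works with writeLong/readLong in place of the UTF methods. The symmetry that write and readFields must keep (same fields, same order) can be checked with plain java.io streams — no Hadoop dependency; the method and values here are illustrative, not part of the original answer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RoundTripDemo {

  // Serializes (userId, pageId) the way write() would, then reads
  // it back the way readFields() would, and reports what came out.
  static String roundTrip(long userId, String pageId) throws IOException {
    // write() side: field order is userId, then pageId
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeLong(userId);   // numeric ID instead of writeUTF
    out.writeUTF(pageId);
    out.flush();

    // readFields() side: must read in exactly the same order
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    long readUserId = in.readLong();
    String readPageId = in.readUTF();
    return readUserId + "/" + readPageId;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(roundTrip(42L, "page-7")); // prints 42/page-7
  }
}
```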

The compareTo logic defines how the dataset is sorted, and it also tells the reducer which keys are equal so their values can be grouped into the same reduce call.
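The resulting order is lexicographic over (userId, pageId): userId decides first, pageId only breaks ties, so all records for one user arrive at the reducer contiguously. The same ordering can be sketched without Guava using a standard Comparator chain (the record and sample data are hypothetical):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortOrderDemo {

  // a plain value pair standing in for UserPageWritable
  record UserPage(String userId, String pageId) {}

  // primary key: userId; tie-break: pageId — mirrors the ComparisonChain
  static final Comparator<UserPage> ORDER =
      Comparator.comparing(UserPage::userId)
                .thenComparing(UserPage::pageId);

  public static void main(String[] args) {
    List<UserPage> keys = new ArrayList<>(List.of(
        new UserPage("u2", "p1"),
        new UserPage("u1", "p2"),
        new UserPage("u1", "p1")));
    keys.sort(ORDER);
    // u1/p1, u1/p2, u2/p1 — both u1 keys sort together
    System.out.println(keys);
  }
}
```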

ComparisonChain is a handy utility from Guava.

Don't forget to override equals and hashCode! The default HashPartitioner chooses the reducer from the hashCode of the key, so keys that compare equal must hash equally too.
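A consistent equals/hashCode pair, together with the formula Hadoop's default HashPartitioner uses ((hashCode & Integer.MAX_VALUE) % numReduceTasks), can be sketched in plain Java. The class below is a stand-in for the Writable key, not Hadoop's actual class:

```java
import java.util.Objects;

public class PartitionDemo {

  static final class UserPageKey {
    final String userId;
    final String pageId;

    UserPageKey(String userId, String pageId) {
      this.userId = userId;
      this.pageId = pageId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof UserPageKey)) return false;
      UserPageKey k = (UserPageKey) o;
      return userId.equals(k.userId) && pageId.equals(k.pageId);
    }

    @Override
    public int hashCode() {
      // must be built from the same fields that equals and compareTo use
      return Objects.hash(userId, pageId);
    }
  }

  // same formula as Hadoop's HashPartitioner.getPartition
  static int partition(UserPageKey key, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

  public static void main(String[] args) {
    UserPageKey a = new UserPageKey("u1", "p1");
    UserPageKey b = new UserPageKey("u1", "p1");
    // equal keys always land on the same reducer
    System.out.println(partition(a, 10) == partition(b, 10)); // prints true
  }
}
```

The masking with Integer.MAX_VALUE is there because hashCode can be negative and the partition index must not be.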

You could write your own key class that implements WritableComparable (which already extends Writable) and compares your two fields.

Pierre-Luc Bertrand
