Map-reduce program to implement a data structure in the Hadoop framework

落爺英雄遲暮 submitted on 2019-12-25 10:20:00

Question


This is a data structure implementation in Hadoop. I want to implement indexing in Hadoop using map-reduce programming.

Part 1: store each word of a text file in a table, together with its index number. [Completed]
Part 2: apply hashing to this newly created table. [Not completed]

I was able to complete the first part, but I am having difficulty with the second.

Suppose I have a text file containing 3 lines:

  how is your job
  how is your family
  hi how are you

I want to store this text file using indexing. I have map-reduce code that returns the index value of every word, and I am able to store these index values in an index table (hash table). The output, which contains the index value of every word: how 0, how 14, is 3, is 18, job 12, your 7,

Now, to store the words in the hash table, apply a hash function to the index value of every word: the index value modulo (%) the number of distinct elements in the file, let's say 4. If there is a collision at a location, go to the next location and store the word there.

  0%4=0(store 'how' at hash index 0)
  14%4=2(store 'how' at hash index 2)
  18%4=2(store 'is' at hash index 3 because of collision) 
  7%4=3 (store 'your' at index 4 because of collision)

Answer 1:


You can create a Hashtable object and put the key and value into it.

Hashtable<Integer, String> hashtable = new Hashtable<>();

How do you find the key? You have the total count of distinct words and each word's index, so: key = index % (number of distinct words), value = word.

Before inserting a record into the hashtable, check whether a collision occurs for that key. How can you check for a collision?

boolean collision=hashtable.containsKey(key);  

If collision is true, check key+1, key+2, ... linearly; once collision is false, insert the key and value into the hashtable using the line below.

hashtable.put(key,value);
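
Putting these steps together, a minimal sketch could look like the following. It is plain Java rather than a Hadoop job, and the class name LinearProbeDemo and the helper insert are hypothetical names chosen for illustration. It assumes the modulus of 4 and the index values from the question's example. Because the keys live in a Hashtable rather than a fixed-size array, probing simply moves past the modulus (to key 4 here) instead of wrapping around.

```java
import java.util.Hashtable;

// Hypothetical demo class; not part of Hadoop or of the original answer.
class LinearProbeDemo {

    // Stores word under (index % modulus); on collision tries key+1, key+2, ...
    // Returns the key that was actually used.
    static int insert(Hashtable<Integer, String> table, int index, int modulus, String word) {
        int key = index % modulus;
        while (table.containsKey(key)) { // collision: slot already taken
            key++;                       // linear probe to the next slot
        }
        table.put(key, word);
        return key;
    }

    public static void main(String[] args) {
        Hashtable<Integer, String> table = new Hashtable<>();
        int modulus = 4; // assumed number of distinct words, as in the question

        // Index values taken from the question's worked example.
        insert(table, 0, modulus, "how");   // 0 % 4 = 0, stored at 0
        insert(table, 14, modulus, "how");  // 14 % 4 = 2, stored at 2
        insert(table, 18, modulus, "is");   // 18 % 4 = 2, collision, stored at 3
        insert(table, 7, modulus, "your");  // 7 % 4 = 3, collision, stored at 4

        // Print in key order; Hashtable itself does not guarantee iteration order.
        for (int key = 0; key <= 4; key++) {
            if (table.containsKey(key)) {
                System.out.println(key + " -> " + table.get(key));
            }
        }
    }
}
```

With a fixed-size table you would instead wrap the probe with (key + 1) % size and stop once every slot has been checked.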


Source: https://stackoverflow.com/questions/29486393/map-reduce-program-to-implement-data-structure-in-hadoop-framework
