Best way to create a hashmap of arraylist

Submitted by 谁说我不能喝 on 2019-12-27 17:38:26

Question


I have one million rows of data in .txt format. The format is very simple. For each row:

user1,value1
user2,value2
user3,value3
user1,value4
...

You know what I mean. Each user may appear many times or only once (you never know). I need to find all the values for each user. Because users appear in random order, I used a HashMap to do it, that is, HashMap(key: String, value: ArrayList). But to add data to the ArrayList, I have to constantly call HashMap get(key) to get the ArrayList, add the value to it, then put it back into the HashMap. I feel this is not very efficient. Does anybody know a better way to do this?


Answer 1:


You don't need to re-add the ArrayList back to your Map. If the ArrayList already exists then just add your value to it.

An improved implementation might look like:

Map<String, Collection<String>> map = new HashMap<String, Collection<String>>();

while processing each line:

String user = user field from line
String value = value field from line

Collection<String> values = map.get(user);
if (values == null) {
    values = new ArrayList<String>();
    map.put(user, values);
}
values.add(value);
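
For reference, here is a self-contained sketch of the loop above applied to the sample rows from the question. The class name GroupByUser, the group method, splitting on the first comma, and skipping rows without a comma are all assumptions made for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByUser {

    // Groups "user,value" rows by user. Rows without a comma are skipped (an assumption).
    public static Map<String, Collection<String>> group(Iterable<String> lines) {
        Map<String, Collection<String>> map = new HashMap<String, Collection<String>>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // malformed row
            }
            String user = line.substring(0, comma);
            String value = line.substring(comma + 1);

            Collection<String> values = map.get(user);
            if (values == null) {
                values = new ArrayList<String>();
                map.put(user, values); // only needed the first time a user is seen
            }
            values.add(value);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "user1,value1", "user2,value2", "user3,value3", "user1,value4");
        Map<String, Collection<String>> map = group(lines);
        System.out.println(map.get("user1")); // [value1, value4]
    }
}
```

For a real file, the lines could come from something like Files.readAllLines, or be read one at a time with a BufferedReader if the file should not be held in memory all at once.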

Follow-up April 2014 - I wrote the original answer back in 2009, when my knowledge of Google Guava was limited. In light of all that Google Guava does, I now recommend using its Multimap instead of reinventing it.

Multimap<String, String> values = HashMultimap.create();
values.put("user1", "value1");
values.put("user2", "value2");
values.put("user3", "value3");
values.put("user1", "value4");

System.out.println(values.get("user1"));
System.out.println(values.get("user2"));
System.out.println(values.get("user3"));

Outputs (a HashMultimap does not guarantee the order of the values for a key):

[value4, value1]
[value2]
[value3]



Answer 2:


Use Multimap from Google Collections (now part of Guava). It allows multiple values for the same key:

https://google.github.io/guava/releases/19.0/api/docs/com/google/common/collect/Multimap.html




Answer 3:


The ArrayList values in your HashMap are references. You don't need to "put it back to HashMap". You're operating on the object that already exists as a value in the HashMap.
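
A small sketch makes this concrete: the list returned by get is the same object the map holds, so a value added to it is visible through the map without another put. The class name ReferenceDemo and the sample key and value are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReferenceDemo {

    // Adds a value to an existing user's list without calling put again.
    public static Map<String, List<String>> demo() {
        Map<String, List<String>> map = new HashMap<String, List<String>>();
        map.put("user1", new ArrayList<String>());

        List<String> values = map.get("user1"); // same object the map refers to
        values.add("value1");                   // no map.put(...) afterwards

        return map;
    }

    public static void main(String[] args) {
        System.out.println(demo().get("user1")); // [value1]
    }
}
```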




Answer 4:


If you don't want to import a library, you can write a simple multimap yourself:

package util;    

import java.util.ArrayList;    
import java.util.HashMap;    
import java.util.List;    

/**    
 * A simple implementation of a MultiMap. This implementation allows duplicate elements in the
 * values. (I know classes like this are out there, but the ones available to me didn't work.)
 */    
public class MultiMap<K, V> extends HashMap<K, List<V>> {    

  /**    
   * Looks for a list that is mapped to the given key. If there is not one then a new one is created    
   * mapped and has the value added to it.    
   *     
   * @param key    
   * @param value    
   * @return true if the list has already been created, false if a new list is created.    
   */    
  public boolean putOne(K key, V value) {    
    if (this.containsKey(key)) {    
      this.get(key).add(value);    
      return true;    
    } else {    
      List<V> values = new ArrayList<>();    
      values.add(value);    
      this.put(key, values);    
      return false;    
    }    
  }    
}    



Answer 5:


Since Java 8 you can use map.computeIfAbsent:

https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#computeIfAbsent-K-java.util.function.Function-

Collection<String> values = map.computeIfAbsent(user, k -> new ArrayList<>());
values.add(value);
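
Put together, the grouping loop shrinks to one statement per row. This is only a sketch, assuming the rows have already been split into user and value; the class name ComputeIfAbsentDemo and the group helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ComputeIfAbsentDemo {

    // Each row is a two-element array: {user, value}.
    public static Map<String, List<String>> group(String[][] rows) {
        Map<String, List<String>> map = new HashMap<>();
        for (String[] row : rows) {
            // Creates the list only the first time a user is seen, then returns
            // the list (new or existing) so the value can be appended to it.
            map.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return map;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"user1", "value1"}, {"user2", "value2"},
            {"user3", "value3"}, {"user1", "value4"}
        };
        System.out.println(group(rows).get("user1")); // [value1, value4]
    }
}
```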



Answer 6:


I think what you want is a Multimap. You can get it from Apache Commons Collections or from google-collections.

http://commons.apache.org/collections/

http://code.google.com/p/google-collections/

"collection similar to a Map, but which may associate multiple values with a single key. If you call put(K, V) twice, with the same key but different values, the multimap contains mappings from the key to both values."




Answer 7:


I could not find an easy way, and MultiMap is not always an available option, so I wrote something like this.

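import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Stores a single value directly and upgrades it to a List once a second
// value arrives for the same key. It relies on unchecked casts, so V should
// be a common supertype such as Object.
public class Context<K, V> extends HashMap<K, V> {

    @SuppressWarnings("unchecked")
    public V addMulti(K paramK, V paramV) {
        V value = get(paramK);
        if (value == null) {
            // First value for this key: store it as-is.
            put(paramK, paramV);
        } else if (value instanceof List<?>) {
            // The key already maps to a list: append to it.
            ((List<V>) value).add(paramV);
        } else {
            // Second value for this key: wrap both values in a new list.
            List<V> list = new ArrayList<V>();
            list.add(value);
            list.add(paramV);
            put(paramK, (V) list);
        }
        return paramV;
    }
}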



Answer 8:


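It may be faster to use a LinkedList instead of an ArrayList, because an ArrayList has to resize its backing array when it runs out of capacity. Appending to an ArrayList is still amortized constant time, though, so measure before switching.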

You will also want to estimate the capacity of the wrapping collection (HashMap or Multimap) appropriately when you create it, to avoid repeated rehashing.
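
As a rough sketch of the capacity point: a HashMap resizes once its size exceeds capacity times load factor (0.75 by default), so asking for an initial capacity of about expected / 0.75 should avoid rehashing while loading. The class name PresizedMap, the helper capacityFor, and the figure of 100,000 distinct users are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PresizedMap {

    // An initial capacity large enough to hold expectedEntries without a
    // resize at the default load factor of 0.75.
    public static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    public static Map<String, List<String>> create(int expectedUsers) {
        return new HashMap<String, List<String>>(capacityFor(expectedUsers));
    }

    public static void main(String[] args) {
        // Hypothetical estimate: about 100,000 distinct users in the file.
        Map<String, List<String>> map = create(100_000);
        map.computeIfAbsent("user1", k -> new ArrayList<>()).add("value1");
        System.out.println(capacityFor(100_000)); // 133334
    }
}
```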




Answer 9:


As already mentioned, MultiMap is your best option.

Depending on your business requirements or constraints on the data file, you may want to consider doing a one-off sorting of it, to make it more optimised for loading.
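
To illustrate why a pre-sorted file can help: once all rows for a user are adjacent, each user's values can be collected and handed off in a single pass, without keeping every user in a map at once. The sketch below assumes the rows are already sorted by user; the class name SortedGrouping and the BiConsumer callback are hypothetical choices, not something from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;

public class SortedGrouping {

    // Walks rows sorted by user and calls handler once per user with all of
    // that user's values.
    public static void forEachUser(Iterable<String> sortedLines,
                                   BiConsumer<String, List<String>> handler) {
        String currentUser = null;
        List<String> values = new ArrayList<>();
        for (String line : sortedLines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip malformed rows (an assumption)
            }
            String user = line.substring(0, comma);
            if (!user.equals(currentUser)) {
                if (currentUser != null) {
                    handler.accept(currentUser, values); // previous user is complete
                }
                currentUser = user;
                values = new ArrayList<>();
            }
            values.add(line.substring(comma + 1));
        }
        if (currentUser != null) {
            handler.accept(currentUser, values); // flush the last user
        }
    }

    public static void main(String[] args) {
        List<String> sorted = Arrays.asList(
                "user1,value1", "user1,value4", "user2,value2", "user3,value3");
        forEachUser(sorted, (user, vals) -> System.out.println(user + " -> " + vals));
    }
}
```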



Source: https://stackoverflow.com/questions/1010879/best-way-to-create-a-hashmap-of-arraylist
