Suitable Java data structure for parsing large data file

Posted by 吃可爱长大的小学妹 on 2019-12-24 16:10:06

Question


I have a rather large text file (~4 million lines) that I'd like to parse, and I'm looking for advice about a suitable data structure in which to store the data. The file contains lines like the following:

Date        Time    Value
2011-11-30  09:00   10
2011-11-30  09:15   5
2011-12-01  12:42   14
2011-12-01  19:58   19
2011-12-01  02:03   12

I want to group the lines by date, so my initial thought was to use a TreeMap<String, List<String>> that maps each date to the rest of its lines. Is a TreeMap of Lists a ridiculous thing to do? I suppose I could replace the String key with a date object (to avoid so many string comparisons), but it's the List as a value that I'm worried might be unsuitable.

I'm using a TreeMap because I want to iterate the keys in date order.


Answer 1:


is a TreeMap of Lists a ridiculous thing to do?

Conceptually, no, but it is going to be very memory-inefficient (both because of the Map and because of the List). You're looking at an overhead of 200% or more, which may or may not be acceptable depending on how much memory you have to spare.

For a more memory-efficient solution, create a class that has fields for every column (including a Date), put all those in a List and sort it (ideally using quicksort) when you're done reading.
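A minimal sketch of the flat-list approach described above. The names ReadingSorter, Reading, parseLine and parseAndSort are hypothetical, the header line is assumed to have been skipped already, and java.time.LocalDate and LocalTime (Java 8+) stand in for the Date the answer mentions. The sort here is delegated to Collections.sort, which for object lists uses a stable merge sort variant rather than quicksort.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ReadingSorter {

    /** One row of the file; the class and field names are illustrative. */
    static final class Reading {
        final LocalDate date;
        final LocalTime time;
        final int value;

        Reading(LocalDate date, LocalTime time, int value) {
            this.date = date;
            this.time = time;
            this.value = value;
        }
    }

    /** Orders readings by date first, then by time of day. */
    static final Comparator<Reading> BY_DATE_TIME =
            Comparator.comparing((Reading r) -> r.date).thenComparing(r -> r.time);

    /** Parses one whitespace-separated data line such as "2011-11-30  09:00   10". */
    static Reading parseLine(String line) {
        String[] cols = line.trim().split("\\s+");
        return new Reading(LocalDate.parse(cols[0]),
                           LocalTime.parse(cols[1]),
                           Integer.parseInt(cols[2]));
    }

    /** Parses every line into one flat list, then sorts that list once. */
    static List<Reading> parseAndSort(List<String> lines) {
        List<Reading> readings = new ArrayList<>(lines.size());
        for (String line : lines) {
            readings.add(parseLine(line));
        }
        Collections.sort(readings, BY_DATE_TIME);
        return readings;
    }
}
```

After sorting, rows with the same date sit next to each other, so grouping by date becomes a single pass over the list that starts a new group whenever the date changes.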




Answer 2:


There's nothing wrong with using a List as the value for a Map. All of those <> look ugly, but it's perfectly fine to put a generic class inside of a generic class.

Instead of using a String as the key, it would probably be better to use java.util.Date because the keys are dates. This guarantees that the TreeMap orders the keys chronologically. If you store the dates as Strings, the TreeMap sorts them as strings, not as "real" dates, and that order is only chronological when the text format happens to sort that way (as the zero-padded yyyy-MM-dd format in the question does).

Map<Date, List<String>> map = new TreeMap<Date, List<String>>();
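As a rough sketch of how that map might be filled, assuming each data line has the "date time value" layout from the question and the header line has already been skipped. GroupByDate and group are hypothetical names, and wrapping ParseException in an IllegalArgumentException is just one possible way to handle a malformed date.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupByDate {

    /** Maps each date to the rest of its lines; keys iterate in chronological order. */
    static Map<Date, List<String>> group(List<String> lines) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject impossible dates such as 2011-13-45
        Map<Date, List<String>> map = new TreeMap<Date, List<String>>();
        for (String line : lines) {
            // Split into the date column and everything after it.
            String[] cols = line.trim().split("\\s+", 2);
            Date date;
            try {
                date = format.parse(cols[0]);
            } catch (ParseException e) {
                throw new IllegalArgumentException("Bad date in line: " + line, e);
            }
            List<String> rest = map.get(date);
            if (rest == null) {
                rest = new ArrayList<String>();
                map.put(date, rest);
            }
            rest.add(cols[1]);
        }
        return map;
    }
}
```

Iterating over map.entrySet() then visits the dates in ascending order, which is the property the question wanted from the TreeMap.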



Answer 3:


There is no objection to using Lists, though in your case a List<Integer> as the value type of the Map might be more appropriate.
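One way this could look, as a sketch only: the values are parsed to Integer and the time column is dropped, which is an assumption about what the grouping is for. ValuesByDate and group are hypothetical names, and java.time.LocalDate (Java 8+) is used as the key type here because it parses the yyyy-MM-dd format directly.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ValuesByDate {

    /** Collects the Value column of each line under its date, in file order. */
    static Map<LocalDate, List<Integer>> group(List<String> lines) {
        Map<LocalDate, List<Integer>> byDate = new TreeMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+"); // date, time, value
            LocalDate date = LocalDate.parse(cols[0]);
            int value = Integer.parseInt(cols[2]);
            // Create the list for this date on first sight, then append.
            byDate.computeIfAbsent(date, d -> new ArrayList<>()).add(value);
        }
        return byDate;
    }
}
```

If the time of day matters later on, the value type would need to carry it as well, for example as a small class with a time field and a value field.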



Source: https://stackoverflow.com/questions/8324523/suitable-java-data-structure-for-parsing-large-data-file
