How to access the reference values of a HashSet without enumeration?

前端未结

关注

 2  641

I have this scenario in which memory conservation is paramount. I am trying to read in > 1 GB of Peptide sequences into memory and group peptide instances together that shar

相关标签:

2条回答

广开言路

2020-12-06 19:14

Use a Dictionary<string, Peptide>.

0 讨论(0)
发布评论:

提交评论
- 加载中...
情深已故

2020-12-06 19:25

Basically you could reimplement HashSet<T> yourself, but that's about the only solution I'm aware of. The Dictionary<Peptide, Peptide> or Dictionary<string, Peptide> solution is probably not that inefficient though - if you're only wasting a single reference per entry, I would imagine that would be relatively insignificant.

In fact, if you remove the hCode member from Peptide, that will safe you 4 bytes per object which is the same size as a reference in x86 anyway... there's no point in caching the hash as far as I can tell, as you'll only compute the hash of each object once, at least in the code you've shown.

If you're really desperate for memory, I suspect you could store the sequence considerably more efficiently than as a string. If you give us more information about what the sequence contains, we may be able to make some suggestions there.

I don't know that there's any particularly strong reason why HashSet doesn't permit this, other than that it's a relatively rare requirement - but it's something I've seen requested in Java as well...

0 讨论(0)
发布评论:

提交评论
- 加载中...