Datastore for large astrophysics simulation data

好久不见. 提交于 2019-12-03 04:01:47

EDIT

Rationale for my lack of explanation / etc.:

  • OP says: "[HDF5]'s a huge improvement over unformatted binary files with a custom header block (still give me nightmares), but I can't help but think there could be a better way to do this."

What does "better" mean? Better structured? He seems to allude to the "unformatted binary files" as being an issue - so maybe that's what he means by better. If so, he'll need something with some structure - hence the first couple suggestions.

  • OP says: "I imagine the sheer data size is the issue here, but is there some sort of datastore that can handle terabytes of binary data efficiently, or are binary files the only way at this point?"

Yes, there are several. Both structured, and "unstructured" - does he want structure, or is he happy to leave them in some sort of "unformatted binary format"? We still don't know - so I suggest checking out some Distributed File Systems.

  • OP says: "If it helps, we typically store data columnwise. That is, you have a block of all particle id's, block of all particle positions, block of particle velocites, etc. It's not the prettiest, but it is the fastest for doing something like a particle lookup in some volume."

Again, Does the OP want better structure, or doesn't he? Seems like he wants both - better structure AND faster.... maybe scaling OUT will give him this. This further reinforces the first few options I listed.

  • OP says (in comments): "I don't know if we can take the hit on io though."

Are there IO requirements? Cost restrictions? What are they?

We can't get something for nothing here. There is no "silver-bullet" storage solution. All we have to go on here for requirements is "lots of data" and "I don't know if I like the lack of structure, but I'm not willing to increase my IO to accommodate any additional structure"... so I don't know what kind of answer he's expecting. He hasn't listed a single complaint about the current solution he has other than the lack of structure - and he's already said he's not willing to pay any overhead to do anything about that... so.... ?

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!