I have a Cassandra 1.1.2 installation on my system as a single node cluster and have three keyspaces: hotel, student, and employee.
I don't recommend using sstable2json and json2sstable to load a large amount of data. These tools use the Jackson API to build the dataset and transform it to JSON, which means loading all of the data into memory to create a single JSON representation.
That is fine for a small amount of data, but now imagine loading a dataset of more than 40 million rows, about 25 GB of data: these tools simply don't work well at that scale. I already asked the DataStax guys about it without getting any clarification.
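For reference, this is roughly how the two tools are invoked. The data file path below is only an example, since SSTable names and directory layout vary between Cassandra versions, and rooms stands in for one of your column families:

    # dump one SSTable of the hotel keyspace to JSON
    bin/sstable2json /var/lib/cassandra/data/hotel/rooms/hotel-rooms-hd-1-Data.db > rooms.json

    # load that JSON back into a new SSTable for the same column family
    bin/json2sstable -K hotel -c rooms rooms.json /tmp/hotel-rooms-hd-1-Data.db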
For large datasets, simply copying the Cassandra data files from one cluster to the other may solve the problem. In my case, though, I was trying to migrate from a Cassandra 1.0.6 cluster to a 1.2.1 cluster, and the data files were not compatible between those versions.
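When both clusters do run the same version, the copy can be as simple as a snapshot plus a refresh. The paths and the rooms column family below are illustrative, and nodetool refresh requires 1.1+; on older versions, restart the target node after copying instead:

    # on the source node: flush memtables, then snapshot the keyspace
    nodetool -h source-node flush hotel
    nodetool -h source-node snapshot hotel

    # copy the snapshot's SSTable files into the target node's data directory
    scp /var/lib/cassandra/data/hotel/rooms/snapshots/<tag>/* \
        target-node:/var/lib/cassandra/data/hotel/rooms/

    # on the target node: tell Cassandra to pick up the copied SSTables
    nodetool -h target-node refresh hotel rooms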
What is the solution when the versions don't match? I'm writing my own export/import tool to solve this. I hope to post a link to that tool soon.
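The core idea behind such a tool is to stream rows one at a time instead of materializing a single giant JSON document. A minimal sketch of that approach in Python, assuming the Thrift-based pycassa client and the hypothetical rooms column family with text columns:

    import json
    import pycassa

    # connect to the hotel keyspace over Thrift (host/port are assumptions)
    pool = pycassa.ConnectionPool('hotel', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'rooms')

    # write one JSON object per row (newline-delimited), so memory use stays
    # flat no matter how many rows the column family contains
    with open('rooms.jsonl', 'w') as out:
        for key, columns in cf.get_range():  # pages through the CF in token order
            out.write(json.dumps({'key': key, 'columns': columns}) + '\n')

The import side can then read that file line by line and batch-insert, so neither direction ever holds more than a page of rows in memory.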