How to copy data from a Cassandra table to another structure for better performance

柔情痞子 submitted on 2019-11-28 12:03:22

To echo what was said about the COPY command, it is a great solution for something like this.

However, I disagree with what was said about the Bulk Loader, as it is much harder to use, mainly because you need to run it on every node, whereas COPY only needs to be run on a single node.

To help COPY scale for larger data sets, you can use the PAGETIMEOUT and PAGESIZE parameters.

COPY invoices(id_invoice, year, id_client, type_invoice) 
  TO 'invoices.csv' WITH PAGETIMEOUT=40 AND PAGESIZE=20;

Using these parameters appropriately, I have used COPY to successfully export/import 370 million rows before.
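PAGETIMEOUT and PAGESIZE only affect the export (COPY TO) side. For the import side, cqlsh COPY FROM has its own tuning options, such as NUMPROCESSES, CHUNKSIZE, MAXBATCHSIZE and INGESTRATE. The following is only a sketch: the target table invoices_yr is taken from the question, and every value shown is an illustrative assumption rather than a recommendation, so tune them against your own cluster.

```cql
-- Illustrative values only (assumptions); adjust for your data and hardware.
COPY invoices_yr(id_invoice, year, id_client, type_invoice)
  FROM 'invoices.csv'
  WITH NUMPROCESSES=4      -- worker processes used by cqlsh
   AND CHUNKSIZE=1000      -- rows handed to each worker at a time
   AND MAXBATCHSIZE=20     -- maximum rows per insert batch
   AND INGESTRATE=50000;   -- approximate cap on rows per second
```

If the import produces timeouts, lowering INGESTRATE or NUMPROCESSES is usually the first thing to try.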

For more info, check out this article titled: New options and better performance in cqlsh copy.

You can use the cqlsh COPY command.
To copy your invoices data into a CSV file, use:

COPY invoices(id_invoice, year, id_client, type_invoice) TO 'invoices.csv';

To copy it back from the CSV file into a table (invoices_yr in your case), use:

COPY invoices_yr(id_invoice, year, id_client, type_invoice) FROM 'invoices.csv';
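The import above assumes that invoices_yr already exists. A minimal sketch of what that table might look like is given below, reusing the key layout from the question (partition by type_invoice, cluster by year and id_client). The column types are assumptions, since the original schema is not shown here; replace them with the types of your invoices table.

```cql
-- Hypothetical schema: column types are assumed, not taken from the question.
CREATE TABLE invoices_yr (
    id_invoice   int,
    year         int,
    id_client    int,
    type_invoice text,
    PRIMARY KEY ((type_invoice), year, id_client)
) WITH CLUSTERING ORDER BY (year DESC, id_client ASC);
```

Note that with this key, two invoices sharing the same type_invoice, year and id_client overwrite each other. If that can happen in your data, consider appending id_invoice as a final clustering column.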

If you have a huge amount of data, you can use an SSTable writer to write the data and sstableloader to load it faster: http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

An alternative to using the COPY command (see the other answers for examples) or Spark to migrate the data is to create a materialized view that does the denormalization for you.

CREATE MATERIALIZED VIEW invoices_yr AS
       SELECT * FROM invoices
       WHERE type_invoice IS NOT NULL AND year IS NOT NULL AND id_client IS NOT NULL
       PRIMARY KEY ((type_invoice), year, id_client)
       WITH CLUSTERING ORDER BY (year DESC);

Cassandra will then populate the view for you, so you won't have to migrate the data yourself. With 3.5, be aware that repairs don't work well (see CASSANDRA-12888).

Note: materialized views are probably not the best idea to use; they have since been changed to "experimental" status.
