Fetch all rows in Cassandra

小蘑菇 2020-12-03 17:34

I have a Cassandra table containing 3 million rows. I am trying to fetch all the rows and write them to several CSV files. I know it is impossible to perform selec

3 Answers
  • 2020-12-03 18:10

    As far as I know, one improvement in Cassandra 2.0 on the driver side is automatic paging. You can do something like this:

    Statement stmt = new SimpleStatement("SELECT * FROM images LIMIT 3000000");
    stmt.setFetchSize(100);
    ResultSet rs = session.execute(stmt);
    
    // Iterate over the ResultSet here
    

    For more, read "Improvements on the driver side with Cassandra 2.0".

    You can find the driver here.
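
Since the question asks for several CSV files, the iteration step can be sketched as follows. This is a minimal sketch and not part of the driver: `CsvChunkWriter`, `writeChunks`, `escape`, and the `part-%05d.csv` naming are hypothetical, and rows are assumed to arrive as an `Iterator<List<String>>`. Because the paged result set is consumed row by row, only one page plus the current output buffer needs to be in memory at a time.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper (not part of the Cassandra driver).
final class CsvChunkWriter {

    // Quote a field only when it contains a comma, a quote, or a line break.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        return needsQuotes ? "\"" + field.replace("\"", "\"\"") + "\"" : field;
    }

    // Drains the iterator, starting a new file every rowsPerFile rows.
    // Returns the files that were written, in order.
    static List<Path> writeChunks(Iterator<List<String>> rows, Path dir, int rowsPerFile)
            throws IOException {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        Files.createDirectories(dir);
        List<Path> files = new ArrayList<>();
        while (rows.hasNext()) {
            Path file = dir.resolve(String.format("part-%05d.csv", files.size()));
            try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                for (int n = 0; n < rowsPerFile && rows.hasNext(); n++) {
                    List<String> row = rows.next();
                    StringBuilder line = new StringBuilder();
                    for (int i = 0; i < row.size(); i++) {
                        if (i > 0) {
                            line.append(',');
                        }
                        line.append(escape(row.get(i)));
                    }
                    out.write(line.toString());
                    out.write('\n');
                }
            }
            files.add(file);
        }
        return files;
    }
}
```

To use it with the driver, each row of the `ResultSet` would first have to be mapped to a list of strings. That mapping depends on the table's columns and is left out here.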

  • 2020-12-03 18:20

    You could use Pig to read the data and store it into HDFS, then copy it out as a single file:

    In Pig:

    data = LOAD 'cql://your_ksp/your_table' USING CqlStorage();
    STORE data INTO '/path/to/output' USING PigStorage(',');
    

    From the OS shell:

    hadoop fs -copyToLocal hdfs://hadoop_url/path/to/output /path/to/local/storage
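
A `STORE` in Pig normally writes a directory of `part-*` files rather than one file, so the copied directory may still need merging. `hadoop fs -getmerge` can do that on the Hadoop side. As an alternative, here is a minimal local sketch in Java: `PartFileMerger` and `merge` are hypothetical names, and the code assumes the part files have already been copied to a local directory and should be concatenated in file-name order.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: concatenates local part-* files into one file.
final class PartFileMerger {

    // Returns the number of part files that were merged into target.
    static int merge(Path outputDir, Path target) throws IOException {
        List<Path> parts = new ArrayList<>();
        // The glob only matches names starting with "part-", so marker files
        // such as _SUCCESS are skipped.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : stream) {
                parts.add(p);
            }
        }
        // Directory listings are unordered; sort so part-00000 comes first.
        Collections.sort(parts);
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out); // streams the file, no full load into memory
            }
        }
        return parts.size();
    }
}
```

The target file should not itself be named `part-*` inside the same directory, otherwise a rerun would pick it up as input.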
    
  • 2020-12-03 18:33

    By default, a SELECT statement returns only 10,000 records, so to retrieve more than that you have to specify a LIMIT:

    SELECT * FROM tablename LIMIT 10000000;

    In your case there are 3 million rows, so set the limit to at least that.
