Write Drill query output to csv (or some other format)

没有蜡笔的小新 2020-12-17 23:27

I'm using Drill in embedded mode, and I can't figure out how to save query output other than by copying and pasting it.

5 answers

抹茶落季 (original poster)

2020-12-17 23:29
If you are using SQLLine, use `!record <file_path>` to record the session output to a file.

If you are running a set of queries, you need to specify the exact schema to use. This can be done with the `USE <schema>` command. Unfortunately, you cannot use your root schema for this; in the example configuration below, the `root` workspace is not writable. Ensure that you have created the correct directory on your file system and that the storage configuration points to it. An example configuration is below.

After this, you can create a CSV from Java using the SQL driver, or with a tool such as Pentaho. With the proper specification, it is also possible to use the REST query tool at localhost:8047/query. The query that produces a CSV at /out/data/csv follows the configuration example.
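As a rough illustration of the REST route mentioned above, here is a minimal Python sketch. It takes a different approach from the CTAS query in this answer: it runs a plain SELECT and writes the rows to CSV on the client side. It assumes Drill's web server is on localhost:8047, that queries are posted to `/query.json`, and that the reply contains `columns` and `rows` fields. The helper names (`build_payload`, `run_query`, `result_to_csv`) are made up for this example.

```python
import csv
import io
import json
import urllib.request

# Assumption: Drill's web server listens on localhost:8047 and accepts
# SQL submitted as JSON at /query.json.
DRILL_QUERY_URL = "http://localhost:8047/query.json"


def build_payload(sql):
    """Encode one SQL statement as the JSON body of a query request."""
    return json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")


def run_query(sql, url=DRILL_QUERY_URL):
    """POST the statement to Drill and return the decoded JSON reply."""
    request = urllib.request.Request(
        url, data=build_payload(sql), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def result_to_csv(result):
    """Render a reply shaped like {"columns": [...], "rows": [{...}]} as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=result.get("columns", []),
        extrasaction="ignore",  # skip keys that are not listed in "columns"
        lineterminator="\n",
    )
    writer.writeheader()
    for row in result.get("rows", []):
        writer.writerow(row)  # missing columns are written as empty fields
    return buffer.getvalue()
```

With Drill running, something like `result_to_csv(run_query("SELECT * FROM fs.`my_records_in.json`"))` should return the CSV text, which can then be written to a file. Because the whole result passes through memory, this only suits modest result sets.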

Storage Configuration
```
{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "config": null,
  "workspaces": {
    "root": {
      "location": "/out",
      "writable": false,
      "defaultInputFormat": null
    },
    "jsonOut": {
      "location": "/out/data/json",
      "writable": true,
      "defaultInputFormat": "json"
    },
    "csvOut": {
      "location": "/out/data/csv",
      "writable": true,
      "defaultInputFormat": "csv"
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    }
  }
}
```
Query
```
USE fs.csvOut;
ALTER SESSION SET `store.format`='csv';
CREATE TABLE fs.csvOut.mycsv_out
AS SELECT * FROM fs.`my_records_in.json`;
```
This will produce at least one CSV file, and possibly many with different header specifications, in /out/data/csv/mycsv_out.

Each file name should match the following pattern:
```
\d+_\d+_\d+\.csv
```
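To pick up the generated files programmatically, the pattern can be applied to the output directory. This is a small sketch, not part of Drill; the function name `list_part_files` is hypothetical, and the dot in the pattern is escaped so that it matches only a literal `.`.

```python
import re
from pathlib import Path

# File-name pattern from the answer, with the dot escaped so it matches literally.
PART_FILE = re.compile(r"\d+_\d+_\d+\.csv")


def list_part_files(directory):
    """Return the files in `directory` whose full name matches the pattern, sorted."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and PART_FILE.fullmatch(path.name)
    )
```

For the query above, the directory to pass would be /out/data/csv/mycsv_out.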
Note: While the query result can be read as a single CSV, the resulting CSV files (if there is more than one) cannot, because the number of headers will vary between files. If this is the case, write the output as JSON instead and read it with code, or later with Drill or another tool.
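If the CSV files already exist and switching to JSON is not convenient, one possible workaround is to merge them over the union of their columns. The sketch below assumes that every file starts with a header row and that all rows fit in memory; `merge_csv_files` is a hypothetical helper, not something Drill provides.

```python
import csv


def merge_csv_files(paths, destination):
    """Merge CSV files with differing headers into one file.

    The output header is the union of all input headers, in first-seen order.
    Columns that a given input file lacks are written as empty fields.
    """
    columns = []
    rows = []
    for path in paths:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle)  # assumes the first line is a header
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            rows.extend(reader)
    with open(destination, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return columns
```

The function returns the merged header so the caller can check which columns ended up in the combined file.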