apache-drill

How to use Apache Drill to do a page search

霸气de小男生 submitted on 2019-12-12 05:14:02
Question: I want to use Apache Drill to do a paged search, but it only provides a LIMIT keyword and I don't know how to write a good SQL query. Can anybody help me? Thank you!

Answer 1: Drill supports both the LIMIT and OFFSET operators, so pagination can be achieved with them. Sample query:

    SELECT * FROM cp.`employee.json` ORDER BY employee_id LIMIT 20 OFFSET 10 ROWS

Some important points from the Drill docs: the OFFSET number must be a positive integer and cannot be larger than the number of rows in the underlying result set.
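
To page through results, the OFFSET is normally computed from the page number. A minimal sketch of that pattern, assuming a hypothetical page size of 20 and the bundled cp.`employee.json` sample data (the column names come from that sample):

    -- page 3 at 20 rows per page: OFFSET = (page - 1) * page_size = 40
    SELECT employee_id, full_name
    FROM cp.`employee.json`
    ORDER BY employee_id
    LIMIT 20 OFFSET 40;

A deterministic ORDER BY (here employee_id) matters: without it, rows can shift between pages from one query to the next.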

Making a storage plugin on Apache Drill for HDFS

爷，独闯天下 submitted on 2019-12-12 04:23:47
Question: I'm trying to make a storage plugin for Hadoop (HDFS) and Apache Drill. I'm confused: I don't know what to set as the port for the hdfs:// connection, and what to set for the location. This is my plugin:

    {
      "type": "file",
      "enabled": true,
      "connection": "hdfs://localhost:54310",
      "workspaces": {
        "root": { "location": "/", "writable": false, "defaultInputFormat": null },
        "tmp": { "location": "/tmp", "writable": true, "defaultInputFormat": null }
      },
      "formats": { "psv": { "type": "text", "extensions
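
For reference (not part of the truncated question above): the connection string has to match the NameNode address configured as fs.defaultFS in the cluster's core-site.xml; 8020 and 9000 are common defaults, and 54310 is correct only if the cluster was set up that way. Each workspace location is then a directory path inside that filesystem. A minimal sketch of a complete plugin, with the address and paths as placeholder assumptions:

    {
      "type": "file",
      "enabled": true,
      "connection": "hdfs://localhost:8020",
      "workspaces": {
        "root": { "location": "/user/drill", "writable": true, "defaultInputFormat": null }
      },
      "formats": {
        "psv": { "type": "text", "extensions": ["tbl"], "delimiter": "|" },
        "parquet": { "type": "parquet" }
      }
    }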

Can't create Drill storage plugin for Oracle

浪尽此生 submitted on 2019-12-12 04:17:29
Question: I want to create a storage plugin in Drill for Oracle JDBC. I copied ojdbc7.jar to the apache-drill-1.3.0/jars/3rdparty path and added drill.exec.sys.store.provider.local.path = "/mypath" to drill-override.conf. When I try to create a new storage plugin with the configuration below:

    {
      "type": "jdbc",
      "enabled": true,
      "driver": "oracle.jdbc.OracleDriver",
      "url": "jdbc:oracle:thin:user/pass@x.x.x.x:1521/orcll"
    }

I get an "unable to create/update storage" error. I am using Red Hat 7 and Drill 1.3.
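
No answer is preserved above, so the following is only a sketch of a configuration shape that is commonly accepted, not a confirmed fix: the thin-driver URL with a service name is usually written with an @// prefix, and Drill's JDBC plugin takes the credentials as separate fields. The host and service name are placeholders, and the Drillbit must be restarted after ojdbc7.jar is dropped into jars/3rdparty so the driver is actually on the classpath.

    {
      "type": "jdbc",
      "enabled": true,
      "driver": "oracle.jdbc.OracleDriver",
      "url": "jdbc:oracle:thin:@//x.x.x.x:1521/ORCL",
      "username": "user",
      "password": "pass"
    }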

Apache Drill - First start time is high

↘锁芯ラ submitted on 2019-12-11 15:42:54
Question: I am running SQL against a MongoDB backend using Drill and getting a response time of about 500 ms, but most of that time is spent in the "First start" phase; the actual processing in Drill takes much less time (about 50 ms). Why does "First start" take so much time? I would like to know what Drill is doing in that phase and, if possible, optimise it. (Fragment profile and operator profile screenshots were attached.)

Answer 1: After the first query, Drill creates a lot of cache objects to improve further work; see Generated Code Cache [1], [2], [3] for
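
The answer is cut off above, but the gist is that Drill's planning and generated-code caches are cold on the very first query. If that first-query latency matters, one workaround (an assumption on my part, not something the answer states) is to run a cheap warm-up query once the Drillbit is up, so real queries hit warm caches:

    -- hypothetical warm-up statement, run once after startup
    -- against the bundled classpath sample data
    SELECT employee_id FROM cp.`employee.json` LIMIT 1;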

Unable to create storage plugin for MySQL in Apache Drill

夙愿已清 submitted on 2019-12-11 13:39:32
Question: With help from the documentation (http://drill.apache.org/docs/rdbms-storage-plugin/) I've been trying to create a storage plugin for MySQL in Apache Drill. I tried multiple JDBC drivers (mysql-connector-java-5.1.39-bin, sqlserverjdbc), but I always get the error: Please retry: error (unable to create/update storage). My configuration is as follows:

    {
      "type": "jdbc",
      "driver": "mysql-connector-java-5.1.39-bin",
      "url": "jdbc:mysql://localhost:3306",
      "username": "root",
      "password": "password",
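
The configuration above is cut off, but one detail in it stands out: the driver field is set to the jar file name, whereas Drill's JDBC plugin expects the JDBC driver class name. A sketch of the shape that is usually accepted (assuming the Connector/J jar is in jars/3rdparty and the Drillbit has been restarted; host, port and credentials are placeholders):

    {
      "type": "jdbc",
      "enabled": true,
      "driver": "com.mysql.jdbc.Driver",
      "url": "jdbc:mysql://localhost:3306",
      "username": "root",
      "password": "password"
    }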

Converting HL7 v2 to JSON

有些话、适合烂在心里 submitted on 2019-12-11 12:50:06
Question: I am looking to convert HL7 v2 messages (the older EDI-style format) to JSON, so I can make them processable with Apache Drill and compressible as Parquet. I looked into HAPI, but I am not having luck finding a utility for non-XML HL7-to-JSON conversion. Does anyone have a suggestion or a reference to a library?

Answer 1: Just use HAPI to convert to XML. The code below requires Saxon, because the XML-to-JSON step requires XSLT 2.0, but if you already have a method to convert XML to JSON, then you just need
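
The answer's code is cut off above. As a rough sketch of the HAPI half of that pipeline only (ER7 to HL7 XML; the class name is illustrative, and the XML-to-JSON step via Saxon/XSLT 2.0 is not shown):

    import ca.uhn.hl7v2.HL7Exception;
    import ca.uhn.hl7v2.model.Message;
    import ca.uhn.hl7v2.parser.DefaultXMLParser;
    import ca.uhn.hl7v2.parser.PipeParser;

    public class Hl7ToXml {
        // Parse a pipe-delimited (ER7) HL7 v2 message and re-encode it as HL7 XML.
        // Note: HL7 v2 uses carriage returns (\r) as segment separators.
        public static String toXml(String er7Message) throws HL7Exception {
            Message parsed = new PipeParser().parse(er7Message);
            return new DefaultXMLParser().encode(parsed);
        }
    }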

What are the limitations of Apache Drill?

試著忘記壹切 submitted on 2019-12-11 12:42:54
Question: What are the limitations of Apache Drill? Where does it fall short compared to Apache Hive/Impala?

Answer 1: My view on Drill, holistically: one of its main advantages is that you can query across multiple databases; you just configure the sources and query them directly, and that is its biggest advantage. It has also been shown to perform very well against many other technologies (check reference 2). I would not call these limitations, but since it is a query engine, it just takes the SQL
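
To make the "query across multiple databases" point concrete, here is a hypothetical federated query joining a Parquet file in a file workspace with a MongoDB collection (every plugin, path and column name below is made up for illustration):

    SELECT o.order_id, c.full_name
    FROM dfs.`/data/orders.parquet` AS o
    JOIN mongo.crm.`customers` AS c
      ON o.customer_id = c.customer_id;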

Apache Drill losing Unicode in TSVs

╄→гoц情女王★ submitted on 2019-12-11 05:40:09
Question: I'm using the text/TSV storage plugin with Apache Drill, and the output TSV files contain ? in place of Unicode characters. If I use the JSON storage plugin, the Unicode is fine. Something like:

    URL: http://localhost:8047/query.json
    Payload: { "queryType": "SQL", "query": "CREATE TABLE st.`repo`.`test` AS SELECT * FROM st.`repo`.`unicode_data`" }

Answer 1: Set the JVM file encoding and this is fixed: JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8

Source: https://stackoverflow.com/questions/43151313/apache-drill-losing
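
A sketch of how that fix is typically applied (embedded mode shown; on a cluster the same variable has to be set in the environment of every Drillbit):

    # make UTF-8 the JVM's default file encoding for anything Drill launches
    export JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF8"
    # restart Drill so the option takes effect
    ./bin/drill-embedded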

drill-jdbc vs drill-jdbc-all jar

て烟熏妆下的殇ゞ submitted on 2019-12-11 02:54:20
Question: There are two JDBC drivers for Apache Drill, drill-jdbc and drill-jdbc-all. The Maven dependencies are:

    <dependency>
      <groupId>org.apache.drill.exec</groupId>
      <artifactId>drill-jdbc</artifactId>
      <version>1.4.0</version>
    </dependency>

and

    <dependency>
      <groupId>org.apache.drill.exec</groupId>
      <artifactId>drill-jdbc-all</artifactId>
      <version>1.4.0</version>
    </dependency>

I am using drill-jdbc and things are working fine. But according to Drill's documentation for JDBC, the driver is located at: <drill
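
For context (not part of the truncated question): both artifacts expose the same org.apache.drill.jdbc.Driver class; drill-jdbc-all is the self-contained, shaded build that bundles its dependencies. A minimal connection sketch that works with either, assuming a Drillbit running on localhost and the bundled sample data:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DrillJdbcSmokeTest {
        public static void main(String[] args) throws Exception {
            // Direct Drillbit connection; for a cluster use jdbc:drill:zk=<zk-hosts>
            String url = "jdbc:drill:drillbit=localhost";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT employee_id, full_name FROM cp.`employee.json` LIMIT 5")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getString(2));
                }
            }
        }
    }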

Generating Parquet files - differences between R and Python

给你一囗甜甜゛ submitted on 2019-12-10 21:48:44
Question: We have generated a Parquet file with Dask (Python) and with Drill (from R, using the sergeant package). We have noticed a few issues: (1) the Dask (i.e. fastparquet) output has _metadata and _common_metadata files, while the Parquet written through R/Drill does not have these files and has parquet.crc files instead (which can be deleted). What is the difference between these Parquet implementations?

Answer 1: (Only answering (1); please post separate questions to make them easier to answer.) _metadata
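
The answer is cut off above. For readers who want to poke at the difference themselves, a small sketch using Dask's fastparquet engine (the library calls are real; the data and paths are made up) writes a dataset and then opens it through the consolidated _metadata footer:

    import pandas as pd
    import dask.dataframe as dd
    from fastparquet import ParquetFile

    # Write a tiny partitioned dataset with the fastparquet engine. Alongside
    # the per-partition part files, Dask writes _metadata / _common_metadata,
    # which collect the schema and row-group footers of every part.
    # (Newer Dask releases may need write_metadata_file=True for this.)
    pdf = pd.DataFrame({"id": range(10), "value": list("abcdefghij")})
    dd.from_pandas(pdf, npartitions=2).to_parquet("example_parquet", engine="fastparquet")

    # A reader that understands _metadata can plan the whole dataset from
    # that one file instead of opening every part.
    pf = ParquetFile("example_parquet")
    print(pf.columns, len(pf.row_groups))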