phoenix

How to export table schemas in apache phoenix?

Submitted by 余生颓废 on 2019-12-10 03:28:52
Question: I'd like to export the schema of an existing table in Apache Phoenix. Is there a command or tool that does the same thing as SHOW CREATE TABLE TABLE_NAME in MySQL? Thanks.
Answer 1: Use the Phoenix sqlline tool: !describe <table>
Answer 2: Apache Phoenix is typically used as a SQL front end to a NoSQL database (HBase, on top of Hadoop). Perhaps it would help if you were more specific about the challenge you are trying to address.
Answer 3: Since "native" HBase is schema-less (you can only specify column families),
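If a programmatic export is needed rather than a console description, one option (not mentioned in the answers above) is to read the table metadata through Phoenix's standard JDBC interface. A minimal sketch, assuming a reachable ZooKeeper quorum at localhost:2181 and a hypothetical table named MY_TABLE:

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class DescribePhoenixTable {
    public static void main(String[] args) throws Exception {
        // Assumption: quorum address and table name are placeholders; adjust for your cluster.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181")) {
            DatabaseMetaData meta = conn.getMetaData();
            // Phoenix exposes table metadata through the standard JDBC DatabaseMetaData API.
            try (ResultSet cols = meta.getColumns(null, null, "MY_TABLE", null)) {
                while (cols.next()) {
                    System.out.printf("%s %s(%s)%n",
                            cols.getString("COLUMN_NAME"),
                            cols.getString("TYPE_NAME"),
                            cols.getString("COLUMN_SIZE"));
                }
            }
        }
    }
}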

Simple integer comparison in HBase

Submitted by 随声附和 on 2019-12-08 12:18:03
Question: I am trying out a very simple example in HBase. Here is how I create the table and put data:
create 'newdb3','data'
put 'newdb3','row1','data:name','Thexxx Beatles'
put 'newdb3','row2','data:name','The Beatles'
put 'newdb3','row3','data:name','Beatles'
put 'newdb3','row4','data:name','Thexxx'
put 'newdb3','row1','data:duration',400
put 'newdb3','row2','data:duration',300
put 'newdb3','row3','data:duration',200
put 'newdb3','row4','data:duration',100
scan 'newdb3', {COLUMNS => 'data:name',
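The excerpt cuts off before the actual comparison, but the usual stumbling block with this setup is that the shell stores 400 as the ASCII string "400", so filters compare bytes lexicographically rather than numerically. A sketch (not from the original post) of doing the comparison from the Java client, assuming the durations are re-written as fixed-width big-endian longs so that byte order matches numeric order for non-negative values:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class DurationScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("newdb3"))) {
            // Store the duration as an 8-byte big-endian long instead of the string "400".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("data"), Bytes.toBytes("duration"), Bytes.toBytes(400L));
            table.put(put);

            // Byte-wise GREATER comparison now matches numeric order for non-negative longs.
            Scan scan = new Scan();
            scan.setFilter(new SingleColumnValueFilter(
                    Bytes.toBytes("data"), Bytes.toBytes("duration"),
                    CompareFilter.CompareOp.GREATER,
                    new BinaryComparator(Bytes.toBytes(250L))));
            try (ResultScanner results = table.getScanner(scan)) {
                for (Result r : results) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}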

How to connect to a Kerberos-secured Apache Phoenix data source with WildFly?

Submitted by 丶灬走出姿态 on 2019-12-08 07:21:56
Question: I have recently spent several weeks trying to get WildFly to connect to a Kerberized Apache Phoenix data source. There is surprisingly little documentation on how to do this, but now that I have cracked it, I'm sharing. Environment: WildFly 9+. An equivalent JBoss version should also work (but is untested). WildFly 8 does not contain the required org.jboss.security.negotiation.KerberosLoginModule class (but you can hack it; see Kerberos sql server datasource in Wildfly
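The excerpt ends before the WildFly configuration itself, but as background, the Phoenix thick driver accepts Kerberos credentials directly in the JDBC URL. A minimal standalone sketch (not the poster's WildFly setup; the quorum, principal, and keytab path are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class KerberizedPhoenixClient {
    public static void main(String[] args) throws Exception {
        // Assumption: placeholders follow the documented Phoenix URL form
        // jdbc:phoenix:<quorum>:<port>:<hbase znode>:<principal>:<keytab>
        String url = "jdbc:phoenix:zk1,zk2,zk3:2181:/hbase-secure:app@EXAMPLE.COM:/etc/security/keytabs/app.keytab";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 5")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}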

PhoenixOutputFormat not found when running a Spark Job on CDH 5.4 with Phoenix 4.5

Submitted by 别等时光非礼了梦想. on 2019-12-07 14:26:37
Question: I managed to configure Phoenix 4.5 on Cloudera CDH 5.4 by recompiling the source code. sqlline.py works well, but there are problems with Spark.
spark-submit --class my.JobRunner \
  --master yarn --deploy-mode client \
  --jars `ls -dm /myapp/lib/* | tr -d ' \r\n'` \
  /myapp/mainjar.jar
The /myapp/lib folder contains the Phoenix core jar, which contains the class org.apache.phoenix.mapreduce.PhoenixOutputFormat. But it seems that the driver/executor cannot see it. Exception in thread "main" java

Creating Views and Indexes in Phoenix (on HBase)

Submitted by 给你一囗甜甜゛ on 2019-12-06 23:00:46
I. HBase shell commands
1. Enter the HBase shell
# Step 1: go to the bin directory of the HBase installation
cd /home/gulfmoon/apps/hbase-1.2.4/bin
# Step 2: start the HBase shell
hbase shell
Output shown after a successful start:
2. List all tables in HBase
list
3. When in doubt, use help
II. Creating views and indexes with Phoenix
1. Start the Phoenix client
# Step 1: go to the bin directory of the Phoenix installation
cd /home/gulfmoon/apps/apache-phoenix-4.14.0-HBase-1.2-bin/bin
# Step 2: start Phoenix
./start-sqlline.sh
The screen after a successful start looks like this:
2. List all HBase tables and Phoenix views and indexes
!tables
3. When in doubt, use help
4. Create a view
# Delete the Phoenix view
DROP VIEW IF EXISTS VIEW_TEST CASCADE;
# Create the Phoenix view
CREATE VIEW VIEW_TEST (
  ROWKEY VARCHAR PRIMARY KEY,
  "F1".TEST_ID UNSIGNED_LONG -- F1 is the column family
) AS
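The excerpt stops before the index part of the title. For completeness, here is a sketch of creating a secondary index on the view through Phoenix's JDBC driver; only VIEW_TEST and TEST_ID come from the post above, the quorum address and index name are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateViewIndex {
    public static void main(String[] args) throws Exception {
        // Assumption: the ZooKeeper quorum address is illustrative; adjust for your cluster.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement()) {
            // Secondary index on the TEST_ID column of the view created above.
            stmt.execute("CREATE INDEX IF NOT EXISTS IDX_VIEW_TEST_ID ON VIEW_TEST (TEST_ID)");
            conn.commit();
        }
    }
}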

Phoenix duplicate records -- causes of duplicated query results and solutions

Submitted by 大兔子大兔子 on 2019-12-06 12:25:40
Problem description
Issue A: with the parameter enabled (phoenix.stats.enabled=true), Phoenix SQL queries return duplicate rows (more rows than HBase actually stores).
Issue B: with the parameter disabled (phoenix.stats.enabled=false), Phoenix SQL performance drops.
Environment
Phoenix version: phoenix-4.8.0-HBase-1.1
Purpose of this post
Investigate how statistics affect queries.
Parameter description
phoenix.stats.enabled: whether statistics collection is enabled (default: true).
What the parameter does
With stats enabled, major compactions and region splits automatically call StatisticsCollector's updateStatistic method, which collects the region's key information, computes guideposts, and writes them to the SYSTEM.STATS table.
Effect of the parameter (parallelism)
Phoenix SQL improves performance by splitting a query into more scans and running those scans in parallel. The data between two guideposts is treated as a chunk; each chunk corresponds to one scan, and running the scans in parallel speeds up the query. Chunk size is configured with phoenix.stats.guidepost.width: a smaller chunk means more scans and higher parallelism
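As a side note (not part of the original post), guideposts can also be recomputed on demand from any Phoenix client, which can help when investigating stats-related behaviour like the above. A minimal JDBC sketch, assuming a placeholder table MY_TABLE and a local quorum:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RefreshPhoenixStats {
    public static void main(String[] args) throws Exception {
        // Assumption: quorum address and table name are placeholders.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement()) {
            // UPDATE STATISTICS re-collects guideposts and rewrites SYSTEM.STATS for the table.
            stmt.execute("UPDATE STATISTICS MY_TABLE ALL");
        }
    }
}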

Ways to read and write HBase (Part 2): Spark

Submitted by 半世苍凉 on 2019-12-06 11:34:37
Ways to read and write HBase (Part 2): Spark
https://www.cnblogs.com/swordfall/p/10517177.html
Category: HBase
1. Overview of ways to read and write HBase
The main options are:
reading and writing HBase with the plain Java API;
reading and writing HBase from Spark;
reading and writing HBase from Flink;
reading and writing HBase through Phoenix.
The first is the fairly low-level but efficient approach provided by HBase itself; the second and third are the Spark and Flink integrations with HBase; the last is the JDBC approach offered by the third-party Phoenix plugin, and that Phoenix JDBC approach can also be called from Spark and Flink.
Note: we use HBase 2.1.2, Spark 2.4, and Scala 2.12 here; all code below was developed against these versions.
2. Reading and writing HBase from Spark
Reading and writing HBase from Spark mainly splits into the old and new APIs, plus bulk insertion into HBase and operating on HBase through Phoenix.
2.1 Old and new APIs for reading and writing HBase from Spark
2.1.1 Writing data from Spark to HBase
Use the old-style saveAsHadoopDataset to save data to HBase.
/**
 * saveAsHadoopDataset
 */
def writeToHBase(): Unit ={
//
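The original continues with a Scala implementation that is cut off here. A rough Java equivalent of the same old-API saveAsHadoopDataset pattern, as a sketch only (the table name and cell values are illustrative, not from the article):

import java.util.Arrays;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapred.TableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapred.JobConf;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class WriteToHBase {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("writeToHBase"));

        // Old mapred API: configure the HBase output table on a JobConf
        // (HBaseConfiguration picks up hbase-site.xml from the classpath).
        JobConf jobConf = new JobConf(HBaseConfiguration.create());
        jobConf.setOutputFormat(TableOutputFormat.class);
        jobConf.set(TableOutputFormat.OUTPUT_TABLE, "test_table"); // hypothetical table name

        // Build (rowkey, Put) pairs and write them with saveAsHadoopDataset.
        JavaPairRDD<ImmutableBytesWritable, Put> puts = sc
                .parallelize(Arrays.asList("row1", "row2"))
                .mapToPair(row -> {
                    Put put = new Put(Bytes.toBytes(row));
                    put.addColumn(Bytes.toBytes("data"), Bytes.toBytes("name"), Bytes.toBytes("value-" + row));
                    return new Tuple2<>(new ImmutableBytesWritable(Bytes.toBytes(row)), put);
                });
        puts.saveAsHadoopDataset(jobConf);

        sc.stop();
    }
}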

An introduction to Apache Phoenix (SQL on HBase)

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-06 08:53:36
1. What Phoenix is
Phoenix started as an open-source project at Salesforce and later became a top-level project of the Apache Foundation. Phoenix is a SQL layer built on top of HBase that lets us create tables, insert data, and query HBase data with standard JDBC APIs instead of the HBase client APIs: put the SQL back in NoSQL.
Phoenix is written entirely in Java and acts as an embedded JDBC driver for HBase. The Phoenix query engine compiles a SQL query into one or more HBase scans and orchestrates their execution to produce standard JDBC result sets. By working directly with the HBase API, coprocessors, and custom filters, it delivers millisecond-level latency for simple queries and second-level latency for queries over millions of rows.
There are many query tools for HBase, such as Hive, Tez, Impala, Spark SQL, and Phoenix.
Phoenix lets us write less code, with better performance than hand-written code, by:
compiling SQL into native HBase scans;
determining the optimal start and stop keys for each scan;
running scans in parallel;
...
Companies using Phoenix
2. History
3.0/4.0 release
ARRAY type: supports the standard JDBC array type.
Sequences: support for CREATE/DROP SEQUENCE, NEXT VALUE FOR,
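To make the "standard JDBC instead of HBase client APIs" point concrete, here is a minimal sketch of querying Phoenix through plain JDBC (the quorum, table, and columns are illustrative, not from the article):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PhoenixJdbcExample {
    public static void main(String[] args) throws Exception {
        // Assumption: local ZooKeeper quorum; US_POPULATION is a placeholder table.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT STATE, CITY, POPULATION FROM US_POPULATION WHERE POPULATION > ? ORDER BY POPULATION DESC")) {
            ps.setLong(1, 1_000_000L);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("CITY") + ": " + rs.getLong("POPULATION"));
                }
            }
        }
    }
}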

Apache Phoenix - How to start the query server and thin client on a Kerberos cluster

Submitted by 北慕城南 on 2019-12-06 04:06:17
Question: I have recently spent several days trying to run the Phoenix thin client (queryserver.py and sqlline-thin.py) and the thick client via ZooKeeper against a secure cluster, but I have not been able to start or connect to the Phoenix service on the secure cluster. I ran into the issues below with both the thin and thick clients:
17/09/27 08:41:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error: java.lang.RuntimeException: java.lang.NullPointerException

PHOENIX SPARK - Load Table as DataFrame

Submitted by 拈花ヽ惹草 on 2019-12-06 02:47:01
Question: I have created a DataFrame from an HBase table (Phoenix) that has 500 million rows. From the DataFrame I create an RDD of JavaBeans and use it to join with data from a file.
Map<String, String> phoenixInfoMap = new HashMap<String, String>();
phoenixInfoMap.put("table", tableName);
phoenixInfoMap.put("zkUrl", zkURL);
DataFrame df = sqlContext.read().format("org.apache.phoenix.spark").options(phoenixInfoMap).load();
JavaRDD<Row> tableRows = df.toJavaRDD();
JavaPairRDD<String, AccountModel>
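For reference (not part of the original question), the same phoenix-spark data source can also write a DataFrame back to a Phoenix table. A sketch using the same option keys as the read above, with OUTPUT_TABLE as a placeholder target table:

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SaveMode;

public class PhoenixWriteExample {
    // Writes an existing DataFrame back to a Phoenix table via the phoenix-spark data source.
    public static void writeToPhoenix(DataFrame df, String zkURL) {
        df.write()
          .format("org.apache.phoenix.spark")
          .mode(SaveMode.Overwrite)         // phoenix-spark performs upserts; Overwrite is the documented mode
          .option("table", "OUTPUT_TABLE")  // placeholder target table
          .option("zkUrl", zkURL)
          .save();
    }
}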