kettle

Applying Pivot in Pentaho Kettle

烂漫一生 posted on 2019-12-24 13:55:43
Question: I'm using Pentaho Kettle version 5.2.0. I'm trying to pivot my source data. Here is the structure of my source:

Billingid  sku_id  qty
1          0       1
1          0       12
1          0       6
1          0       1
1          0       2
1          57      2
1          1430    1
1          2730    1
2          3883    2
2          1456    1
2          571     9
2          9801    5
2          1010    1

And this is what I'm expecting:

billingid  0  57  1430  2730  3883  1456  571  9801  1010
1          ******* sum of qty *******
2

Any help would be much appreciated. Thanks in advance.

Answer 1: For the denormaliser to work, you first have to Sort, and then Group the rows, to have the
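
The sort, group, then denormalise sequence named in the answer can be sketched outside Kettle in plain Python, using the rows from the question. This is only an illustration of the logic, not Kettle's own steps or API; the function name `pivot_sum` and the choice to print 0 for missing cells are assumptions made for the example.

```python
from collections import defaultdict

# Source rows from the question: (billingid, sku_id, qty)
rows = [
    (1, 0, 1), (1, 0, 12), (1, 0, 6), (1, 0, 1), (1, 0, 2),
    (1, 57, 2), (1, 1430, 1), (1, 2730, 1),
    (2, 3883, 2), (2, 1456, 1), (2, 571, 9), (2, 9801, 5), (2, 1010, 1),
]


def pivot_sum(rows):
    """Hypothetical helper: sum qty per (billingid, sku_id), then pivot."""
    # Sort and group: one summed qty per (billingid, sku_id) pair.
    grouped = defaultdict(int)
    for billingid, sku_id, qty in sorted(rows):
        grouped[(billingid, sku_id)] += qty

    # Denormalise: one entry per billingid, keyed by sku_id.
    pivot = defaultdict(dict)
    for (billingid, sku_id), total in grouped.items():
        pivot[billingid][sku_id] = total
    return {billingid: dict(cells) for billingid, cells in pivot.items()}


table = pivot_sum(rows)

# Column order taken from the expected output in the question.
sku_columns = [0, 57, 1430, 2730, 3883, 1456, 571, 9801, 1010]
for billingid in sorted(table):
    # Missing cells are shown as 0 here; that is a choice of this sketch.
    print(billingid, [table[billingid].get(sku, 0) for sku in sku_columns])
```

In this sketch the five sku_id 0 rows for billingid 1 collapse into a single summed cell, which is the "sum of qty" the question asks for.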

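A Brief Introduction to General Knowledge of the ETL Tool KETTLE

耗尽温柔 posted on 2019-12-24 08:32:30
I. Where to place the database driver jars Kettle uses to connect to different databases: When Kettle is deployed on different operating systems, the location for the database driver jars differs and depends on the operating system. Place the driver jar in the folder that corresponds to your operating system (the original post shows the locations in a figure).

II. Setting the encoding in Kettle: If the data being processed contains Chinese, an encoding must be set for it, usually UTF-8. The thorough way to change it is to add the following to spoon.bat or spoon.sh: "-Dfile.encoding=UTF-8".

III. How Kettle jobs and transformations work together:
1. A job is like a main task line. From this main line it can call several transformations. Each transformation can take data from the job's main line, process it, and pass it back to the job's main line. A transformation can also work on its own: fetch data, process data, output data.
2. When variables need to be set, jobs and transformations are usually used together, because a variable set in the current line cannot be read in that same line; it can only be read in the next one.
3. A task can have only one main job but may have several sub-jobs. The main job can call transformations as well as sub-jobs; which to call depends on the requirements at hand.

IV. How to convert content between a Kettle file repository and a database repository:
1. Importing a file repository into a database repository: (1) First, log in to the database repository in Kettle; (2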

Running pentaho spoon UI in intellij

泄露秘密 posted on 2019-12-24 06:06:19
Question: I have been trying to set up a dev environment for Pentaho Kettle but had some trouble running Spoon.Java, and I see a lot of ClassNotFoundException com.google.api..... Maybe this is because the dependencies of the plugins package are not on the classpath of the UI package that runs the Spoon application. How can I efficiently set up a dev environment and run the Spoon tool from IntelliJ for development and debugging? There are older articles, but they seem to be outdated since ivy/ant is not used anymore, only Maven

Error setting up initial repository in Pentaho Kettle

只谈情不闲聊 posted on 2019-12-23 05:16:24
Question: I'm setting up Pentaho for the first time. It can see MySQL, but when I try to set up the initial repository it gives me this error: org.pentaho.di.core.exception.KettleAuthException: Incorrect password or login. It seems to want an admin (username) password. I don't see anything about this in the docs, and web searches have not been fruitful. Any help appreciated.

Answer 1: If you want to create a new Kettle database repository using MySQL, you need to follow some setup instructions. You can try

KETTLE Study Notebook

爱⌒轻易说出口 posted on 2019-12-23 00:38:44
20191222 Kettle (an essential tool for real-time and offline big data development): https://www.bilibili.com/video/av53070265
1. http://kettle.pentaho.com/
2. http://wiki.pentaho.com/
3. http://infocenter.pentaho.com
4. kettle cook book
5. pentaho 3.2 data integration beginner’s guide
6. kettle solution
7. kettle source code
8. kettle, download: http://kettle.pentaho.com/
9. 傲飞 data integration platform 1.0.4, download: http://pentahochina.com
Source: CSDN Author: Hong.J.X. Link: https://blog.csdn.net/weixin_37565541/article/details/103657651

Limit no. of rows in mongodb input

…衆ロ難τιáo~ posted on 2019-12-22 08:19:21
Question: How do I limit the number of rows retrieved by the MongoDB Input step in a Kettle transformation? I tried the queries below in the MongoDB Input query field, but neither works: {"$query" : {"$limit" : 10}} or {"$limit" : 10}. Please let me know where I am going wrong. Thanks, Deepthi

Answer 1: There are several query modification operators you can use. Their names are not totally intuitive and don't match the names of functions you would use in the Mongo shell, but they do the same sorts of things. In your

using variable names for a database connection in Pentaho Kettle

好久不见. posted on 2019-12-22 00:09:29
Question: I am working with PDI Kettle. Can we define a variable and use it in a database connection, so that if I need to change the connections in multiple transformations in the future, I only have to change the variable value in the kettle.properties file?

Answer 1: Just use variables in the Database Connection, for instance ${DB_HostName}, ${DB_Name}, etc. Then put the values in your kettle.properties: DB_HostName=localhost. You can tell which fields support variables by the S in the blue diamond.
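
To make the idea concrete, here is a rough Python sketch of how ${...} placeholders resolve against entries written in kettle.properties style. It is not Kettle's implementation. The helper names `parse_properties` and `substitute` are hypothetical, the value `DB_Name=sales` is made up for the example, and leaving unknown placeholders untouched is a choice of this sketch.

```python
import re


def parse_properties(text):
    """Hypothetical helper: read simple KEY=value lines, skipping comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def substitute(template, props):
    """Hypothetical helper: replace ${NAME} with its value from props.

    Unknown names are left as-is in this sketch.
    """
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda match: props.get(match.group(1), match.group(0)),
        template,
    )


# DB_HostName=localhost comes from the answer; DB_Name=sales is invented here.
kettle_properties = """
# connection settings shared by all transformations
DB_HostName=localhost
DB_Name=sales
"""

props = parse_properties(kettle_properties)
host = substitute("${DB_HostName}", props)
database = substitute("${DB_Name}", props)
print(host, database)
```

Changing the one line in the properties text changes the resolved host everywhere the placeholder is used, which is the point of the answer.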

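Complete Code for Running Kettle 8.3 from Java

早过忘川 posted on 2019-12-21 02:07:16
I recently needed Kettle at work, so I integrated Kettle into Java for execution. I also drew on articles by quite a few experts, and I thank them here! Based on the Kettle 8.3 open-source edition; integrates calling it from Java; Java calls both Trans and Job. Kettle's base jars are stored in lib.

Usage

Adding plugins: Kettle scripts sometimes need Kettle's own plugins. Put the plugins you need into a single directory and then point to that directory. Kettle keeps its plugins in the plugins folder under the root directory. To add plugins: KettleConfig.getInstance().addPluginFolder(plugin directory path);

Adding JNDI: Some special service connections, for example mysql8, use JNDI. To add the JNDI folder path: KettleConfig.getInstance().setJndi(JNDI directory path)

Running a Trans: fname: path to the ktr script file to run; params: parameters. KettleImplement.runKtr(fname,params);

Running a Job: fname: path to the job script file to run; params: parameters. KettleImplement.runKjb(filename, params);

Code for running a Trans: /** * Run a ktr file * * @param fname path to the ktr file * @param params parameters passed in * @return *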

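Downloading, Installing, and Basic Usage of Pentaho

夙愿已清 posted on 2019-12-20 16:16:58
I. First go to the official site. It is entirely in English, so I translated it. https://community.hitachivantara.com/s/article/data-integration-kettle Click the link marked with the red box to download it. After the download finishes, extract the archive. Kettle is open-source software written purely in Java. It runs with JDK 1.7 or later configured locally, and it can be used directly after extraction without installation.

II. Configure the pentaho_java_home variable in the environment variables. Its value is the local JDK path. Once configured, click Spoon.bat and wait patiently for a while after it opens.

III. Create a database connection. Click Transformation and switch to the main object tree, where you can see DB connections. Click DB connections. Choose the MySQL connection and enter the connection details. Then click Test, and an error appears. It is caused by the missing MySQL driver package, so put the MySQL driver package into pdi-ce-8.3.0.0-371\data-integration\lib. Find the driver package that matches your MySQL version. If you download a driver version that is too old, the error Unknown system variable 'query_cache_size' appears and the database connection fails. The driver package I downloaded is mysql-connector-java-5.1.8.jar. The connection test now succeeds. Click OK.

IV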

How to configure Database connection for production environment in Pentaho data integration Kettle transformation

喜夏-厌秋 posted on 2019-12-19 10:52:07
Question: I designed a ktr file for a transformation. I need to configure the database connection details for the production environment. How can I do this? Any suggestions?

Answer 1: I use environment variables: KETTLE_HOME, KETTLE_JNDI_ROOT, PATH=$PATH:$KETTLE_HOME. KETTLE_HOME is just a link to a directory. I have a directory specially devoted to the data-integration suite, and it contains several versions of Kettle. Example: /opt/kettle/data-integration-4.4.0 (a few old jobs made several years ago) /opt/kettle