Pentaho

Downloading, installing, and basic use of Pentaho

Submitted by 夙愿已清 on 2019-12-20 16:16:58
1. First, go to the official site: https://community.hitachivantara.com/s/article/data-integration-kettle (the page is in English, so it is translated here). Click the download link highlighted with a red box in the original screenshots, and once the download finishes, unzip the archive. Kettle is open-source software written in pure Java; it runs on any machine with JDK 1.7 or later, and after unzipping it can be used directly with no installation.

2. Set the PENTAHO_JAVA_HOME environment variable, with your local JDK path as its value. Once it is configured, double-click Spoon.bat and wait patiently for it to start.

3. Create a database connection. Open a transformation and switch to the main object tree, where you will see "DB connection"; click it, choose the MySQL connection type, and enter the connection details. Clicking Test then raises an error, caused by the missing MySQL driver jar, so drop the MySQL driver jar into pdi-ce-8.3.0.0-371\data-integration\lib. Use the driver that matches your MySQL version: a driver that is too old fails with the error

Unknown system variable 'query_cache_size'

and cannot connect to the database. I downloaded mysql-connector-java-5.1.8.jar, after which the connection test succeeds. Click OK.
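Before creating the connection in Spoon, it can help to verify the driver jar against your MySQL server with a minimal JDBC test run with the same jar on the classpath. A sketch, where the URL, schema, and credentials are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;

public class MySqlConnectionTest {
    public static void main(String[] args) throws Exception {
        // Placeholder host, schema, and credentials; replace with your own.
        String url = "jdbc:mysql://localhost:3306/test?useSSL=false";
        try (Connection conn = DriverManager.getConnection(url, "root", "password")) {
            // Prints the server version if the driver and server are compatible.
            System.out.println("Connected: " + conn.getMetaData().getDatabaseProductVersion());
        }
    }
}

If this test fails with "Unknown system variable 'query_cache_size'", the driver jar is too old for the server, independently of anything Kettle does.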

Pentaho JNDI source name as parameter (Multi-Tenant)

Submitted by て烟熏妆下的殇ゞ on 2019-12-20 07:36:53
Question: I have googled this for the last half an hour and found hits for Pentaho parameters etc., but nothing that appears to ask or answer this question. I have a set of reports that are the same for each customer, but they need to connect to different databases depending on which customer is running the report. So my idea is to pass the JNDI data source name to the report at runtime as a parameter, so that the customer will connect to the correct database. Is this possible, or is there a better way
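Whether the report engine accepts a JNDI name as a parameter depends on the Pentaho version, but the underlying mechanism is an ordinary JNDI lookup keyed by a runtime string. A minimal sketch of that idea in plain Java, where the java:comp/env prefix and the data source names are assumptions that depend on the container:

import java.sql.Connection;
import javax.naming.InitialContext;
import javax.sql.DataSource;

public class TenantLookup {
    // jndiName would arrive as the report parameter, e.g. "jdbc/customerA".
    // This must run inside a container that provides the JNDI context.
    static Connection connectFor(String jndiName) throws Exception {
        InitialContext ctx = new InitialContext();
        DataSource ds = (DataSource) ctx.lookup("java:comp/env/" + jndiName);
        return ds.getConnection();
    }
}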

Pentaho Spoon/PDI: how to move files to folders with a different name every time?

Submitted by 穿精又带淫゛_ on 2019-12-20 05:28:23
Question: I get new text files every month, from which I extract the data and do some transformations. At the end of every month, I need to move these files to a folder whose name contains the current date, which means the destination folder's name is different every time. I added a step before Move Files that creates a folder named after the current date (e.g. 2019-06-01, 2019-07-01), but in the Move Files step I don't know how to specify the destination folder. I guess "wildcard" is only used for the source...
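In Kettle itself a common approach is to compute the dated path into a variable (e.g. via Get System Info plus Set Variables) and reference it as ${TARGET_DIR} in the Move Files destination field; the variable name here is hypothetical. The file-moving logic itself amounts to this plain-Java sketch, with made-up directory names:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.time.LocalDate;

public class MoveToDatedFolder {
    public static void main(String[] args) throws IOException {
        Path source = Paths.get("input");                                // hypothetical source dir
        Path target = Paths.get("archive", LocalDate.now().toString());  // e.g. archive/2019-06-01
        Files.createDirectories(target);                                 // create the dated folder
        try (DirectoryStream<Path> files = Files.newDirectoryStream(source, "*.txt")) {
            for (Path f : files) {
                Files.move(f, target.resolve(f.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}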

How to configure a database connection for the production environment in a Pentaho Data Integration (Kettle) transformation

Submitted by 喜夏-厌秋 on 2019-12-19 10:52:07
Question: I designed a .ktr file for a transformation. I need to configure the database connection details of the production environment. How can I do this? Any suggestions?

Answer 1: I use environment variables:

KETTLE_HOME
KETTLE_JNDI_ROOT
PATH=$PATH:$KETTLE_HOME

KETTLE_HOME is just a link to a directory. By default I have a directory specially devoted to the data-integration suite. It contains several versions of Kettle, for example /opt/kettle/data-integration-4.4.0 (a few old jobs made several years ago), /opt/kettle
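Building on the answer above: Kettle connection dialogs accept ${VARIABLE} placeholders, so the production values can live outside the .ktr file. A minimal plain-Java sketch of reading such values from the environment; the variable names DB_HOST, DB_PORT, and DB_NAME are hypothetical:

public class DbConfigFromEnv {
    public static void main(String[] args) {
        // In Kettle these would be referenced as ${DB_HOST} etc. in the
        // connection dialog; here we just assemble the JDBC URL they imply.
        String host = System.getenv().getOrDefault("DB_HOST", "localhost");
        String port = System.getenv().getOrDefault("DB_PORT", "3306");
        String name = System.getenv().getOrDefault("DB_NAME", "prod");
        System.out.printf("jdbc:mysql://%s:%s/%s%n", host, port, name);
    }
}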

Pentaho DI - JSON Nested File Output

Submitted by 痴心易碎 on 2019-12-19 10:23:05
Question: I have a requirement where I need to fetch records from multiple tables. The primary table has a one-to-many relationship to the other tables. My data source is Oracle DB, which contains the tables in question: one called Student, the other Subjects. As a sample, I have a Student table where Student_Id is the primary key, alongside other columns such as firstname, lastName, etc. Each student has registered for multiple subjects, so student_id is the foreign key in the Subjects table.
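Since the join returns one flat row per student/subject pair, producing nested JSON means grouping the child rows under their parent key. A minimal sketch of that grouping with the Jackson library, using made-up sample rows in place of the Oracle result set:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.LinkedHashMap;
import java.util.Map;

public class NestedStudentJson {
    public static void main(String[] args) throws Exception {
        // Joined rows as they might come back from Oracle: student_id, firstname, subject.
        String[][] rows = {
            {"1", "Alice", "Math"},
            {"1", "Alice", "Physics"},
            {"2", "Bob", "History"},
        };
        ObjectMapper mapper = new ObjectMapper();
        Map<String, ObjectNode> students = new LinkedHashMap<>();
        for (String[] r : rows) {
            // One object per student; subjects accumulate in a nested array.
            ObjectNode s = students.computeIfAbsent(r[0], id -> {
                ObjectNode n = mapper.createObjectNode();
                n.put("student_id", id);
                n.put("firstname", r[1]);
                n.putArray("subjects");
                return n;
            });
            ((ArrayNode) s.get("subjects")).add(r[2]);
        }
        System.out.println(mapper.writerWithDefaultPrettyPrinter()
                .writeValueAsString(mapper.valueToTree(students.values())));
    }
}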

JSON path parent object, or equivalent MongoDB query

Submitted by 梦想与她 on 2019-12-19 09:26:18
Question: I am selecting nodes in a JSON input but can't find a way to include the parent object's details for each array entry that I am querying. I am using Pentaho Data Integration to query the data, using JSON input fed from a MongoDB input. I have also tried to write a MongoDB query that achieves the same, but cannot seem to do this either. Here are the two fields/paths that display the data:

$.size_break_costs[*].size
$.size_break_costs[*].quantity

Here is the JSON source format:

{ "_id" : ObjectId(
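JSONPath has no parent-axis operator, so one workaround is to pair the fields in code: read the parent identifier once and repeat it for every array element (on the MongoDB side, the aggregation stage $unwind flattens the array in the same way). A plain-Jackson sketch, with a made-up document shaped like the source:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ParentWithChildren {
    public static void main(String[] args) throws Exception {
        // Hypothetical document; the real one has an ObjectId for _id.
        String doc = "{\"_id\":\"prod-1\",\"size_break_costs\":["
                   + "{\"size\":\"S\",\"quantity\":10},{\"size\":\"M\",\"quantity\":4}]}";
        JsonNode root = new ObjectMapper().readTree(doc);
        String parentId = root.get("_id").asText();
        for (JsonNode entry : root.get("size_break_costs")) {
            // One output row per array element, carrying the parent id alongside.
            System.out.println(parentId + "\t" + entry.get("size").asText()
                    + "\t" + entry.get("quantity").asInt());
        }
    }
}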

Missing plugins found while loading a transformation on Kettle

Submitted by 回眸只為那壹抹淺笑 on 2019-12-17 21:36:52
Question: I receive this error whenever I run my extraction from the command line; it does not happen in the Spoon UI.

Missing plugins found while loading a transformation
Step : MongoDbInput
    at org.pentaho.di.job.entries.trans.JobEntryTrans.getTransMeta(JobEntryTrans.java:1200)
    at org.pentaho.di.job.entries.trans.JobEntryTrans.execute(JobEntryTrans.java:643)
    at org.pentaho.di.job.Job.execute(Job.java:714)
    at org.pentaho.di.job.Job.execute(Job.java:856)
    ... 4 more
Caused by: org.pentaho.di.core.exception

How to get the last 7 days of data from the current datetime in SQL Server

Submitted by 妖精的绣舞 on 2019-12-17 07:30:12
Question: Hi, I am loading table A's data from SQL Server to MySQL using Pentaho. When loading the data I need to get only the last 7 days of data from the SQL Server table A into MySQL. In SQL Server the CreatedDate column's data type is datetime, and in MySQL the created_on column's data type is timestamp. I used the query below, but I am getting only 5 days of data. Please help me with this issue.

select id, NewsHeadline as news_headline, NewsText as news_text, state, CreatedDate as created_on
from News
WHERE CreatedDate BETWEEN GETDATE(
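The query is cut off above, but a BETWEEN anchored on GETDATE() typically loses days because it compares against the current time of day rather than midnight. A common fix is a predicate anchored at midnight seven days ago; a sketch of that corrected query wrapped in plain JDBC, with placeholder connection details:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LastSevenDays {
    public static void main(String[] args) throws Exception {
        // Placeholder SQL Server connection string and credentials.
        String url = "jdbc:sqlserver://localhost;databaseName=newsdb";
        // CAST(GETDATE() AS date) drops the time component, so the range
        // starts at midnight seven days ago and no partial day is lost.
        String sql = "SELECT id, NewsHeadline AS news_headline, NewsText AS news_text, "
                   + "state, CreatedDate AS created_on FROM News "
                   + "WHERE CreatedDate >= DATEADD(day, -7, CAST(GETDATE() AS date))";
        try (Connection c = DriverManager.getConnection(url, "user", "pass");
             Statement s = c.createStatement();
             ResultSet rs = s.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("news_headline"));
            }
        }
    }
}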

Fetch Data From Remote Database Every Hour

Submitted by 老子叫甜甜 on 2019-12-14 04:21:24
Question: Yesterday I downloaded the Pentaho BI Server, Data Integration, and Report Designer. Then I connected Report Designer to the remote database, fetched a table, and drew a chart of that data successfully. My question is: I want to run that file (which I created in Report Designer) every hour, fetching the new data from the remote database. Can you please guide me step by step how to do it, because I am new to all this stuff.

Answer 1: I will answer my own question. So, in order to schedule a job in Data Integration, you
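The answer is cut off above, but PDI jobs are commonly run on a schedule either through the Start job entry's repeat settings or through an external scheduler (cron, Windows Task Scheduler) that invokes kitchen.sh/kitchen.bat. As a rough illustration of the external approach, a plain-Java hourly trigger; the install path and job file are hypothetical:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HourlyKettleRun {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // Hypothetical paths; kitchen runs a .kjb job from the command line.
                new ProcessBuilder("/opt/kettle/data-integration/kitchen.sh",
                                   "-file=/opt/jobs/refresh_report.kjb")
                        .inheritIO()
                        .start()
                        .waitFor();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, 1, TimeUnit.HOURS);
    }
}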

How to retrieve an OUT parameter from a MySQL stored procedure into the stream in Pentaho Data Integration (Kettle)?

Submitted by 一个人想着一个人 on 2019-12-14 00:24:14
Question: I am unable to get the OUT parameter of a MySQL procedure call into the output stream with the procedure call step of Pentaho Kettle. I'm having big trouble retrieving the OUT parameter from a MySQL stored procedure into the stream. I think it may be a kind of bug, because it only occurs with an Integer OUT parameter; it works with a String OUT parameter. The exception I get is:

Invalid value for getLong() - '

I think the parameters are correctly set, as you can see in the ktr. You can replicate the bug in this
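For comparison, the same call works in plain JDBC when the OUT parameter is registered with the matching SQL type and read back with getInt rather than getLong. A sketch against a hypothetical procedure get_count(IN name VARCHAR(64), OUT total INT):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class ProcOutParam {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details and a hypothetical procedure.
        String url = "jdbc:mysql://localhost:3306/test";
        try (Connection c = DriverManager.getConnection(url, "root", "password");
             CallableStatement cs = c.prepareCall("{call get_count(?, ?)}")) {
            cs.setString(1, "example");
            cs.registerOutParameter(2, Types.INTEGER); // must match the declared OUT type
            cs.execute();
            System.out.println("total = " + cs.getInt(2)); // getInt, not getLong
        }
    }
}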