etl

How do I populate a relational multi-table MySQL database from an existing one table database?

风格不统一 submitted on 2019-12-31 06:18:40
Question: I have many huge delimited files that I know I can import as a single table, but I need to map that data to an existing relational multi-table MySQL database. There should not be any conflict with datatypes, but I'm super new to this, so please point out anything I should be watching for. Clearly I'm not going to run this in production until I know it works. I'm not 100% sure Stack Overflow is the right place to ask a database question, but I couldn't find any other Stack Exchange that was a
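One common way to approach this kind of load is to import the files into a flat staging table first, then fill the parent tables with INSERT ... SELECT DISTINCT and resolve foreign keys by joining back on the natural key. The sketch below is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL, and the tables staging, customer, and orders (and their columns) are hypothetical names that do not come from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; all table and column
# names below are hypothetical and chosen only for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging table: the flat, single-table import of the delimited files.
cur.execute("CREATE TABLE staging (customer_name TEXT, order_item TEXT)")
cur.executemany(
    "INSERT INTO staging VALUES (?, ?)",
    [("Ann", "widget"), ("Ann", "gadget"), ("Bob", "widget")],
)

# Target relational schema: a parent table and a child table with a foreign key.
cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id), item TEXT)"
)

# Step 1: load each distinct parent value exactly once.
cur.execute("INSERT INTO customer (name) SELECT DISTINCT customer_name FROM staging")

# Step 2: load the child rows, looking up the generated parent id
# by joining the staging rows back to the parent on the natural key.
cur.execute(
    "INSERT INTO orders (customer_id, item) "
    "SELECT c.id, s.order_item FROM staging s "
    "JOIN customer c ON c.name = s.customer_name"
)
conn.commit()

customer_count = cur.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
order_count = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
ann_items = [
    row[0]
    for row in cur.execute(
        "SELECT o.item FROM orders o JOIN customer c ON c.id = o.customer_id "
        "WHERE c.name = 'Ann' ORDER BY o.item"
    )
]
```

The same two-step pattern carries over to MySQL, where the staging table would typically be filled with a bulk import before the INSERT ... SELECT statements run.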

SSIS - the value cannot be converted because of a potential loss of data

故事扮演 submitted on 2019-12-30 17:21:46
Question: I am relatively new to SSIS. I am trying to extract information from an Oracle database using Microsoft OLEDB for Oracle, with this query: SELECT ID FROM Test. I get an error message saying: the value cannot be converted because of a potential loss of data. If I change the query to the following, it works: SELECT '1' FROM Test. I think it is failing because the ID is not an integer. However, the flat file connection manager shows that the OutputColumnWidth is 50. What am I doing

Expose Talend ETL Job as a Web Service

匆匆过客 submitted on 2019-12-30 09:35:15
Question: I am currently evaluating Talend ETL (Talend Open Studio for Data Integration). I would like to know whether, and how, I can expose an ETL Job as a Web Service. I know I can export jobs as web services and invoke them through a specific URL; however, my goal is to expose a specific WSDL with IN / OUT parameters. A sample use case would be: 1) Invoke WS in Talend ETL and pass XML with data 2) Talend ETL extracts the data from the XML and inserts them as variable(s) in the query to be executed

WildCards in SSIS Collection {not include} name xlsx

和自甴很熟 submitted on 2019-12-29 01:43:05
Question: I have a process built in SSIS that loops through Excel files and imports data only from those whose name includes Report. My UserVariable used as Expression is: *Report*.xlsx and it works perfectly fine. Now I am trying to build a similar loop, but only for files that do NOT include Report in the file name. Something like *<>Report*.xlsx. Is it possible? Thanks for help! Matt Answer 1: Unfortunately, you cannot achieve this using an SSIS expression (something like *[^...]*.xlsx); you have to search for
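The selection rule the question asks for (every .xlsx file whose name does not contain Report) is easy to state outside of SSIS. The sketch below only illustrates that filtering logic in Python and is not the approach from the truncated answer; the function name files_without_report and the sample file names are hypothetical.

```python
import fnmatch


def files_without_report(filenames):
    """Return the .xlsx names that do NOT contain 'Report' (case-sensitive)."""
    return [
        name
        for name in filenames
        # Keep Excel files only ...
        if fnmatch.fnmatchcase(name, "*.xlsx")
        # ... and drop the ones the original *Report*.xlsx mask would match.
        and not fnmatch.fnmatchcase(name, "*Report*.xlsx")
    ]


selected = files_without_report(
    ["Sales_Report_2019.xlsx", "Budget.xlsx", "notes.txt", "Inventory.xlsx"]
)
```

The negative condition is simply the inverse of the mask that already works for the Report files, so both loops stay consistent with each other.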

How to convert result table to JSON array in MySQL

泪湿孤枕 submitted on 2019-12-28 03:33:07
Question: I'd like to convert a result table to a JSON array in MySQL, preferably using only plain MySQL commands. For example, with the query SELECT name, phone FROM person; | name | phone | | Jack | 12345 | | John | 23455 | the expected JSON output would be [ { "name": "Jack", "phone": 12345 }, { "name": "John", "phone": 23455 } ] Is there a way to do that in plain MySQL? EDIT: There are some answers on how to do this with e.g. MySQL and PHP, but I couldn't find a pure MySQL solution. Answer 1: New solution: Built using

SSIS reading LF as terminator when its set as CRLF

冷暖自知 submitted on 2019-12-28 02:15:12
Question: I am using SSIS 2012. In my flat file connection manager I have a delimited file where the row delimiter is set to CRLF, but when it processes the file, a text column has an LF in it. SSIS reads that LF as a row terminator, which causes it to fail. Any ideas? Answer 1: Before answering, I don't think the column contains only LF, because if the row delimiter is CRLF, a bare LF will not be treated as a delimiter. So it is probably CRLF, but I will give a solution for both cases (CRLF or LF)
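For the bare-LF case, one possible workaround is to clean the file before SSIS reads it: replace every LF that is not preceded by CR, so only the real CRLF row delimiters remain. The sketch below is an assumption-laden illustration, not the answer's own solution; the function name strip_embedded_lf and the sample bytes are hypothetical, and it does not handle a CRLF embedded inside a column.

```python
import re


def strip_embedded_lf(raw: bytes, replacement: bytes = b" ") -> bytes:
    """Replace every LF not preceded by CR, keeping CRLF row delimiters intact."""
    # (?<!\r)\n matches a line feed only when the previous byte is not a carriage return.
    return re.sub(rb"(?<!\r)\n", replacement, raw)


# Row 1 has a stray LF inside its text column; both rows end with CRLF.
sample = b"1,first line\nsecond line\r\n2,plain\r\n"
cleaned = strip_embedded_lf(sample)
```

Working on bytes rather than decoded text avoids any newline translation by the runtime, so the CRLF delimiters pass through unchanged.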

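Data Warehouse Lessons Learned

左心房为你撑大大i submitted on 2019-12-26 21:17:00
Plan the DW by subject area: A subject area covers the things that decision makers in a given field care about. A subject area usually spans several business departments; the product subject area, for example, involves sales, finance, logistics, purchasing, and other departments. A subject area contains subjects; the product subject area, for example, includes subjects such as cost, shipping, and inventory. The subject-area model is an abstraction of the business model and needs to reflect the enterprise business model from the perspective of decision makers and managers. Decision makers do not need to know every department's business details; a sales manager needs to know product inventory and purchasing plans in order to arrange sales, but does not know the business processes of the logistics and purchasing departments. So, while integrating data from multiple business departments, keep the specific business logic of the OLTP databases to a minimum, so that the delivered data is easier to understand and more efficient to use.
EDW development: One development approach is to build several data marts one by one and then merge them into a data warehouse. Its advantage is high efficiency and quick results early on, but because these data marts operate independently, later management and integration run into problems, and the result often ends up as a hub, with multiple data marts supporting one central data mart. Another approach is to build a unified data warehouse first and then have it support multiple data marts. This is difficult to carry out in a large enterprise, and sometimes impossible. In practice the more feasible option is parallel development: each time work starts on a new data mart, adjust the data warehouse and add the new content to it. This approach requires some experience and an understanding of the enterprise as a whole, so that room and flexibility are left for the next adjustment and expansion of the data warehouse.
Get familiar with Business Applications: An enterprise usually has several kinds of business applications, such as ERP,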

ETL JAR execution from BI server

感情迁移 submitted on 2019-12-25 16:54:07
Question: I am trying to execute a JAR file from an ETL job, and this works fine. When I call the same ETL from an xaction, it shows this error: ERROR 05-02 09:58:28,491 - Call Data Importer - org.pentaho.di.core.exception.KettleValueException: Javascript error:TypeError: Cannot call property runImageImpoter in object [JavaPackage com.MyTest.Data.Importer]. It is not a function, it is "object". (script#5) at org.pentaho.di.trans.steps.scriptvalues_mod.ScriptValuesMod.addValues(ScriptValuesMod.java

Janino Compile Exception : UDJC step

假装没事ソ submitted on 2019-12-25 11:50:36
Question: Thanks in advance for your support. In a UDJC step, the following code gives me a Janino exception. In the processRow method: Hashtable hastable=getConfigData() // This method return Hashtable Set set=hashtable.get("ERROR_2001").keySet(); ---> //hashtable.get("ERROR_2001"), This returns another hashtable Exception: A method named "keySet" is not declared in any enclosing class nor any supertype, nor through a static import In the forums I could not find a workaround to fix this. I am using JDK

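ETL Tool Informatica Development Workflow: Comprehensive Application, Telecom Call Billing System Development Project Case 10

一世执手 submitted on 2019-12-25 11:42:52
I. Prepare the data sources: In the Oracle database, create the OLTP user and import the source data oracle_oltp_data.sql. In the MySQL database, create the table and insert the product-related data mysql_product_data.sql. Customer table ods_cust_info (oltp), region table department (oltp), call table call_record (oltp), product table product (mysql).
II. Requirements: Telecom operations analysis: use the data from each business system to analyze how the company is operating (calls only, not data traffic). Report output: user counts and operating revenue for each dimension and measure. Dimensions (time, region, product). Measures (user count, operating revenue).
III. Development approach: 1: Process the region dimension table accordingly. Community ----> end office ------> district office (village, town, county). create table department_dimension as select a.dept_id dq_id,a.dept_name dq_name,a.level_no dq_level_no, b.dept_id ju_id,b.dept_name ju_name,b.level_no ju_level_no, c.dept_id sq_id,c.dept_name sq_name,c.level_no sq_level_no from (select dept_id,dept_name,level_no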