data-science-experience

SQL Temporary Table Issue

Submitted by 南楼画角 on 2020-12-13 13:09:48
Question: I've created a temporary table DETAILS, following the usual syntax for creating it and inserting into it, but I have not received any result set. The CREATE and INSERT statements ran successfully, and a row was affected by the INSERT, yet the result set was empty when I ran the last SELECT statement to view the record.

    DROP TABLE DETAILS;
    CREATE GLOBAL TEMPORARY TABLE DETAILS AS
        (SELECT ins_id, firstname, pages FROM INSTRUCTOR) DEFINITION ONLY;
    INSERT INTO DETAILS SELECT
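
The DEFINITION ONLY syntax suggests DB2, where a created global temporary table defaults to ON COMMIT DELETE ROWS: rows inserted are discarded at the next commit (including the implicit autocommit after each statement), which would leave the final SELECT empty. A minimal sketch of that explanation, assuming DB2 with autocommit; the ON COMMIT clause and the completed INSERT column list are additions not present in the question's truncated DDL:

    DROP TABLE DETAILS;
    CREATE GLOBAL TEMPORARY TABLE DETAILS AS
        (SELECT ins_id, firstname, pages FROM INSTRUCTOR)
        DEFINITION ONLY
        ON COMMIT PRESERVE ROWS;  -- default is ON COMMIT DELETE ROWS
    -- illustrative column list; the question's INSERT is truncated above
    INSERT INTO DETAILS SELECT ins_id, firstname, pages FROM INSTRUCTOR;
    SELECT * FROM DETAILS;  -- rows now survive commits within this session

Note also that a global temporary table's contents are private to the session that inserted them, so the SELECT must run on the same connection as the INSERT.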

SystemML: Cannot import the submodule mllearn (and therefore Keras2DML function)

Submitted by 为君一笑 on 2020-07-09 03:01:08
Question: I am using IBM Watson Studio (default Spark Python environment) and trying to convert a Keras model to SystemML DML and train it on Spark.

    !pip install systemml
    import systemml

This executes just fine. But this:

    from systemml import mllearn

throws SyntaxError: import * only allowed at module level, and dir(systemml) doesn't show mllearn. I tried to install it from http://www.romeokienzler.com/systemml-1.0.0-SNAPSHOT-python.tar.gz and https://sparktc.ibmcloud.com/repo/latest/systemml-1.0.0
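
That SyntaxError is raised while the interpreter compiles mllearn's own source, not the notebook cell, which usually points at a mismatch between the installed SystemML build and the Python version running the kernel, or at a stale snapshot shadowing a newer release. A small diagnostic sketch (notebook style; the reinstall assumes the same PyPI package name used in the pip install above):

    import sys
    print(sys.version)        # which Python the kernel actually runs
    import systemml
    print(systemml.__file__)  # which installed copy was imported; a stale
                              # snapshot here can shadow a newer release

    !pip uninstall -y systemml
    !pip install --user systemml
    # restart the kernel before re-importing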

Access on prem DB2 from DSX

Submitted by 纵然是瞬间 on 2020-01-07 09:36:50
Question: I am trying to access on-prem DB2 data from DSX using a Python notebook in Jupyter. I have uploaded the db2jcc.jar and license jar files to my home directory, but how do I add the directory to the classpath? Is there another to

Answer 1: You can alternatively use the connector available on DSX to connect to DB2 on prem.

    from ingest import Connectors
    from pyspark.sql import SQLContext

    sqlContext = SQLContext(sc)

    DB2loadOptions = { Connectors.DB2.HOST : '***********',
                       Connectors.DB2.PORT : '***********',
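
The answer's options dict is cut off above. A sketch of how such a load typically continues, following the same Connectors pattern (the remaining option names, the "com.ibm.spark.discover" format string, and the placeholder values are assumptions rather than a continuation of the truncated answer):

    # hypothetical completion -- option names and format string are assumptions
    DB2loadOptions = { Connectors.DB2.HOST              : '***********',
                       Connectors.DB2.PORT              : '***********',
                       Connectors.DB2.DATABASE          : '***********',
                       Connectors.DB2.USERNAME          : '***********',
                       Connectors.DB2.PASSWORD          : '***********',
                       Connectors.DB2.SOURCE_TABLE_NAME : 'SCHEMA.TABLE'}

    db2DF = sqlContext.read.format("com.ibm.spark.discover").options(**DB2loadOptions).load()
    db2DF.show(5)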

DSX Python import error: undefined symbol: PyUnicodeUCS2_AsUTF8String

Submitted by こ雲淡風輕ζ on 2020-01-04 05:35:23
Question: On IBM DSX, I have a Spark service instance on which I have installed a few newer versions of packages such as numpy. I am facing an issue with the import of numpy. The following code:

    import numpy

raises this error message:

    ImportError: /gpfs/fs01/user/USERID/.local/lib/python2.7/site-packages/numpy/core/multiarray.so: undefined symbol: PyUnicodeUCS2_AsUTF8String

The import used to work.

Answer 1: This is because of a mismatch in the Unicode character representation between the Python that you are
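
For background: Python 2 interpreters were compiled as either narrow (UCS2) or wide (UCS4) Unicode builds, and a compiled extension such as numpy's multiarray.so only loads on an interpreter with the matching build. A quick check plus a possible cleanup, assuming the .local path from the error message (removing the user-installed copy falls back to the environment's own numpy):

    import sys
    print(sys.maxunicode)  # 65535 -> narrow (UCS2) build, 1114111 -> wide (UCS4)

    # if the user-installed numpy was compiled for the other build variant:
    !rm -rf ~/.local/lib/python2.7/site-packages/numpy*
    # then restart the kernel so the environment's numpy is imported instead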

TensorFrames on IBM's Data Science Experience

Submitted by 我们两清 on 2019-12-25 15:31:12
Question: This is a follow-up from this question. I want to implement TensorFrames on IBM's Data Science Experience. I will consider it working if I can run all of the examples in the user guide for TensorFrames. I've had to import the following packages to do anything at all with TensorFrames:

    pixiedust.installPackage("http://central.maven.org/maven2/com/typesafe/scala-logging/scala-logging-slf4j_2.10/2.1.2/scala-logging-slf4j_2.10-2.1.2.jar")
    pixiedust.installPackage("http://central.maven.org/maven2
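
For reference, pixiedust.installPackage also accepts Maven coordinates of the form "groupId:artifactId:version", which avoids hand-building Maven Central URLs like the ones above; a sketch using the same scala-logging artifact (a kernel restart is required after installing packages):

    import pixiedust
    # resolved against Maven Central; equivalent to the first URL above
    pixiedust.installPackage("com.typesafe.scala-logging:scala-logging-slf4j_2.10:2.1.2")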

Can I use MeCab on IBM Data Science Experience

Submitted by 僤鯓⒐⒋嵵緔 on 2019-12-25 00:37:09
Question: I want to use MeCab on IBM Data Science Experience. https://pypi.python.org/pypi/mecab-python3 Is it possible?

Answer 1: I'm afraid not, or at least not easily. That Python package requires the native MeCab libraries, which are not installed in the environment where DSX notebooks are running. Neither do users have permission to install them using a package manager (yum). If you're willing to spend effort, you can try to put the libraries from a MeCab rpm into the user file system. Then extend the
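
A rough sketch of that rpm-extraction idea, assuming rpm2cpio and cpio are available in the notebook environment and a MeCab rpm matching the host distribution has already been downloaded to the home directory (every path below is an assumption, and building mecab-python3 may additionally need MeCab's headers and mecab-config from the corresponding devel rpm):

    !cd ~ && rpm2cpio mecab.rpm | cpio -idmv  # unpack under $HOME, no root needed

    import ctypes, os
    # preload the shared library so the binding's extension module can resolve
    # it even though it is not on the system library path
    ctypes.CDLL(os.path.expanduser("~/usr/lib64/libmecab.so.2"),
                mode=ctypes.RTLD_GLOBAL)

    !pip install --user mecab-python3
    import MeCab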

Is it possible for a Spark job on Bluemix to see a list of the other processes on the operating system?

Submitted by 旧城冷巷雨未停 on 2019-12-24 17:06:34
Question: A common approach for connecting to third-party systems from Spark is to provide the credentials for those systems as arguments to the Spark script. However, this raises some questions about security; e.g. see this question: Bluemix spark-submit -- How to secure credentials needed by my Scala jar. Is it possible for a Spark job running on Bluemix to see a list of the other processes on the operating system? I.e. can a job run the equivalent of ps -awx to inspect the processes running on the Spark
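
The question is easy to probe empirically from a job or notebook; a standard-library sketch (what it reveals depends entirely on how the service isolates tenants, about which this snippet assumes nothing):

    import subprocess
    # list every process visible to this job's user/namespace
    print(subprocess.check_output(["ps", "-awx"]).decode())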

AssertionError: Multiple .dist-info directories on Data Science Experience

Submitted by ↘锁芯ラ on 2019-12-24 12:05:03
Question: In a Python 3.5 notebook, backed by an Apache Spark service, I had installed BigDL 0.2 using pip. When removing that installation and trying to install version 0.3 of BigDL, I get this error (line breaks added for readability):

    AssertionError: Multiple .dist-info directories:
    /gpfs/fs01/user/scbc-4dbab79416a6ec-4cf890276e2b/.local/lib/python3.5/site-packages/BigDL-0.3.0.dist-info,
    /gpfs/fs01/user/scbc-4dbab79416a6ec-4cf890276e2b/.local/lib/python3.5/site-packages/BigDL-0.2.0.dist-info

However
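
pip raises this assertion when a leftover .dist-info directory from the old version still sits next to the new one, so deleting the stale metadata directory by hand is a common workaround. A sketch using the paths from the error message (the reinstall line assumes the PyPI package name BigDL):

    !rm -rf ~/.local/lib/python3.5/site-packages/BigDL-0.2.0.dist-info
    !pip install --user BigDL==0.3.0
    # restart the kernel so the new version is picked up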