data-science-experience

SQL Temporary Table Issue

Submitted by 南楼画角 on 2020-12-13 13:09:48
Question: I've created a temporary table DETAILS, following the usual syntax for creating it and inserting into it, but I have not received any result set. The CREATE and INSERT statements ran successfully, and a row was affected by the INSERT, yet the result set was empty when I ran the last SELECT statement to view the record.

    DROP TABLE DETAILS;
    CREATE GLOBAL TEMPORARY TABLE DETAILS AS
        (SELECT ins_id, firstname, pages FROM INSTRUCTOR) DEFINITION ONLY;
    INSERT INTO DETAILS SELECT
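
The DEFINITION ONLY syntax suggests DB2, where a created global temporary table defaults to ON COMMIT DELETE ROWS: rows inserted are discarded at the next commit (including the implicit autocommit after each statement), which would leave the final SELECT empty. A minimal sketch of that explanation, assuming DB2 with autocommit; the ON COMMIT clause and the completed INSERT column list are additions not present in the question's truncated DDL:

    DROP TABLE DETAILS;
    CREATE GLOBAL TEMPORARY TABLE DETAILS AS
        (SELECT ins_id, firstname, pages FROM INSTRUCTOR)
        DEFINITION ONLY
        ON COMMIT PRESERVE ROWS;  -- default is ON COMMIT DELETE ROWS
    -- illustrative column list; the question's INSERT is truncated above
    INSERT INTO DETAILS SELECT ins_id, firstname, pages FROM INSTRUCTOR;
    SELECT * FROM DETAILS;  -- rows now survive commits within this session

Note also that a global temporary table's contents are private to the session that inserted them, so the SELECT must run on the same connection as the INSERT.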

SystemML: Cannot import the submodule mllearn (and therefore Keras2DML function)

Submitted by 为君一笑 on 2020-07-09 03:01:08
Question: I am using IBM Watson Studio (default Spark Python environment) and trying to convert a Keras model to SystemML DML and train it on Spark.

    !pip install systemml
    import systemml

This executes just fine. But this:

    from systemml import mllearn

throws SyntaxError: import * only allowed at module level, and dir(systemml) doesn't show mllearn. I tried to install it from http://www.romeokienzler.com/systemml-1.0.0-SNAPSHOT-python.tar.gz and https://sparktc.ibmcloud.com/repo/latest/systemml-1.0.0
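
That SyntaxError is raised while the interpreter compiles mllearn's own source, not the notebook cell, which usually points at a mismatch between the installed SystemML build and the Python version running the kernel, or at a stale snapshot shadowing a newer release. A small diagnostic sketch (notebook style; the reinstall assumes the same PyPI package name used in the pip install above):

    import sys
    print(sys.version)        # which Python the kernel actually runs
    import systemml
    print(systemml.__file__)  # which installed copy was imported; a stale
                              # snapshot here can shadow a newer release

    !pip uninstall -y systemml
    !pip install --user systemml
    # restart the kernel before re-importing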

Access on prem DB2 from DSX

Submitted by 纵然是瞬间 on 2020-01-07 09:36:50
Question: I am trying to access on-prem DB2 data from DSX using a Python notebook in Jupyter. I have uploaded the db2jcc.jar and license jar files to my home directory, but how do I add the directory to the classpath? Is there another to

Answer 1: You can alternatively use the connector available on DSX to connect to DB2 on prem.

    from ingest import Connectors
    from pyspark.sql import SQLContext

    sqlContext = SQLContext(sc)

    DB2loadOptions = { Connectors.DB2.HOST : '***********',
                       Connectors.DB2.PORT : '***********',
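
The answer's options dict is cut off above. A sketch of how such a load typically continues, following the same Connectors pattern (the remaining option names, the "com.ibm.spark.discover" format string, and the placeholder values are assumptions rather than a continuation of the truncated answer):

    # hypothetical completion -- option names and format string are assumptions
    DB2loadOptions = { Connectors.DB2.HOST              : '***********',
                       Connectors.DB2.PORT              : '***********',
                       Connectors.DB2.DATABASE          : '***********',
                       Connectors.DB2.USERNAME          : '***********',
                       Connectors.DB2.PASSWORD          : '***********',
                       Connectors.DB2.SOURCE_TABLE_NAME : 'SCHEMA.TABLE'}

    db2DF = sqlContext.read.format("com.ibm.spark.discover").options(**DB2loadOptions).load()
    db2DF.show(5)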

DSX Python import error: undefined symbol: PyUnicodeUCS2_AsUTF8String

Submitted by こ雲淡風輕ζ on 2020-01-04 05:35:23
Question: On IBM DSX, I have a Spark service instance on which I have installed a few newer versions of packages such as numpy. I am facing an issue with the import of numpy. The following code:

    import numpy

raises this error message:

    ImportError: /gpfs/fs01/user/USERID/.local/lib/python2.7/site-packages/numpy/core/multiarray.so: undefined symbol: PyUnicodeUCS2_AsUTF8String

The import used to work.

Answer 1: This is because of a mismatch in the Unicode character representation between the Python that you are
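
For background: Python 2 interpreters were compiled as either narrow (UCS2) or wide (UCS4) Unicode builds, and a compiled extension such as numpy's multiarray.so only loads on an interpreter with the matching build. A quick check plus a possible cleanup, assuming the .local path from the error message (removing the user-installed copy falls back to the environment's own numpy):

    import sys
    print(sys.maxunicode)  # 65535 -> narrow (UCS2) build, 1114111 -> wide (UCS4)

    # if the user-installed numpy was compiled for the other build variant:
    !rm -rf ~/.local/lib/python2.7/site-packages/numpy*
    # then restart the kernel so the environment's numpy is imported instead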

TensorFrames on IBM's Data Science Experience

Submitted by 我们两清 on 2019-12-25 15:31:12
Question: This is a follow-up from this question. I want to implement TensorFrames on IBM's Data Science Experience. I will consider it working if I can run all of the examples in the user guide for TensorFrames. I've had to import the following packages to do anything at all with TensorFrames:

    pixiedust.installPackage("http://central.maven.org/maven2/com/typesafe/scala-logging/scala-logging-slf4j_2.10/2.1.2/scala-logging-slf4j_2.10-2.1.2.jar")
    pixiedust.installPackage("http://central.maven.org/maven2
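
For reference, pixiedust.installPackage also accepts Maven coordinates of the form "groupId:artifactId:version", which avoids hand-building Maven Central URLs like the ones above; a sketch using the same scala-logging artifact (a kernel restart is required after installing packages):

    import pixiedust
    # resolved against Maven Central; equivalent to the first URL above
    pixiedust.installPackage("com.typesafe.scala-logging:scala-logging-slf4j_2.10:2.1.2")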

Can I use MeCab on IBM Data Science Experience

Submitted by 僤鯓⒐⒋嵵緔 on 2019-12-25 00:37:09
Question: I want to use MeCab on IBM Data Science Experience. https://pypi.python.org/pypi/mecab-python3 Is it possible?

Answer 1: I'm afraid not, or at least not easily. That Python package requires the native MeCab libraries, which are not installed in the environment where DSX notebooks are running. Neither do users have permission to install them using a package manager (yum). If you're willing to spend effort, you can try to put the libraries from a MeCab rpm into the user file system. Then extend the
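
A rough sketch of that rpm-extraction idea, assuming rpm2cpio and cpio are available in the notebook environment and a MeCab rpm matching the host distribution has already been downloaded to the home directory (every path below is an assumption, and building mecab-python3 may additionally need MeCab's headers and mecab-config from the corresponding devel rpm):

    !cd ~ && rpm2cpio mecab.rpm | cpio -idmv  # unpack under $HOME, no root needed

    import ctypes, os
    # preload the shared library so the binding's extension module can resolve
    # it even though it is not on the system library path
    ctypes.CDLL(os.path.expanduser("~/usr/lib64/libmecab.so.2"),
                mode=ctypes.RTLD_GLOBAL)

    !pip install --user mecab-python3
    import MeCab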

Is it possible for a Spark job on Bluemix to see a list of the other processes on the operating system?

Submitted by 旧城冷巷雨未停 on 2019-12-24 17:06:34
Question: A common approach for connecting to third-party systems from Spark is to provide the credentials for those systems as arguments to the Spark script. However, this raises some questions about security; e.g. see this question: Bluemix spark-submit -- How to secure credentials needed by my Scala jar. Is it possible for a Spark job running on Bluemix to see a list of the other processes on the operating system? I.e. can a job run the equivalent of ps -awx to inspect the processes running on the Spark
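
The question is easy to probe empirically from a job or notebook; a standard-library sketch (what it reveals depends entirely on how the service isolates tenants, about which this snippet assumes nothing):

    import subprocess
    # list every process visible to this job's user/namespace
    print(subprocess.check_output(["ps", "-awx"]).decode())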

AssertionError: Multiple .dist-info directories on Data Science Experience

Submitted by ↘锁芯ラ on 2019-12-24 12:05:03
Question: In a Python 3.5 notebook, backed by an Apache Spark service, I had installed BigDL 0.2 using pip. When removing that installation and trying to install version 0.3 of BigDL, I get this error (line breaks added for readability):

    AssertionError: Multiple .dist-info directories:
    /gpfs/fs01/user/scbc-4dbab79416a6ec-4cf890276e2b/.local/lib/python3.5/site-packages/BigDL-0.3.0.dist-info,
    /gpfs/fs01/user/scbc-4dbab79416a6ec-4cf890276e2b/.local/lib/python3.5/site-packages/BigDL-0.2.0.dist-info

However
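
pip raises this assertion when a leftover .dist-info directory from the old version still sits next to the new one, so deleting the stale metadata directory by hand is a common workaround. A sketch using the paths from the error message (the reinstall line assumes the PyPI package name BigDL):

    !rm -rf ~/.local/lib/python3.5/site-packages/BigDL-0.2.0.dist-info
    !pip install --user BigDL==0.3.0
    # restart the kernel so the new version is picked up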