azure-databricks

databricks configure using cmd and R

[亡魂溺海] submitted on 2020-05-07 09:17:09

Question: I am trying to use the Databricks CLI and invoke databricks configure. This is how I do it from cmd:

    somepath>databricks configure --token
    Databricks Host (should begin with https://): my_https_address
    Token: my_token

I want to invoke the same command using R, so I did:

    tool.control <- c('databricks configure --token',
                      'my_https_address',
                      'my_token')
    shell(tool.control)

I get the following error:

    Error in system(command, as.integer(flag), f, stdout, stderr, timeout) : character string expected as
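
Two things stand out here. First, R's shell() expects a single command string, while tool.control is a three-element character vector, which is exactly what "character string expected" is complaining about. Second, databricks configure --token prompts interactively, so the host and token have to be fed to the CLI's standard input rather than passed as additional commands. A minimal sketch of that stdin-driven pattern, shown in Python for illustration (the host and token are placeholders, and the assumption that the CLI reads both prompts from stdin should be verified against your CLI version):

    import subprocess

    # Answer the interactive prompts of `databricks configure --token`
    # by writing the host and token to the process's standard input.
    proc = subprocess.run(
        ["databricks", "configure", "--token"],
        input="https://my_https_address\nmy_token\n",
        text=True,
        capture_output=True,
    )
    print(proc.returncode, proc.stderr)

In R, the analogous call would be system2("databricks", c("configure", "--token"), input = c("my_https_address", "my_token")), which passes the arguments as a vector and supplies the prompt responses via input.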

Azure Databricks cluster init script - install python wheel

微笑、不失礼 submitted on 2020-04-18 04:00:52

Question: I have a Python script that mounts a storage account in Databricks and then installs a wheel from the storage account. I am trying to run it as a cluster init script, but it keeps failing. My script is of the form:

    #/databricks/python/bin/python
    mount_point = "/mnt/...."
    configs = {....}
    source = "...."
    if not any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
        dbutils.fs.mount(source = source, mount_point = mount_point, extra_configs = configs)
    dbutils.library.install("dbfs
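
One likely cause worth flagging: cluster init scripts run on each node before any Spark context exists, so dbutils is not available inside them (and cluster-scoped init scripts are expected to be shell scripts, not Python). A minimal sketch of the usual workaround, run once from a notebook where dbutils does exist, is to generate a small bash init script that pip-installs the wheel through the /dbfs FUSE mount. The script path and wheel location below are hypothetical:

    # dbutils works in the notebook, so use it to write the init script to DBFS.
    dbutils.fs.put(
        "dbfs:/databricks/init-scripts/install-wheel.sh",
        "#!/bin/bash\n"
        "/databricks/python/bin/pip install /dbfs/mnt/package-source/parser-3.0-py3-none-any.whl\n",
        True,  # overwrite any previous version of the script
    )

The cluster is then configured to run dbfs:/databricks/init-scripts/install-wheel.sh as its init script, with the mounting itself done ahead of time from a notebook rather than at node startup.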

Azure Databricks cluster init script - Install wheel from mounted storage

吃可爱长大的小学妹 submitted on 2020-04-18 04:00:51

Question: I have a Python wheel uploaded to an Azure storage account that is mounted in a Databricks service. I'm trying to install the wheel using a cluster init script, as described in the Databricks documentation. My storage is definitely mounted and my file path looks correct to me. Running the command display(dbutils.fs.ls("/mnt/package-source")) in a notebook yields the result:

    path: dbfs:/mnt/package-source/parser-3.0-py3-none-any.whl
    name: parser-3.0-py3-none-any.whl

I have tried to install the
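
A detail that often explains "the file looks right but the install fails": a notebook lists the file under the dbfs:/ scheme, but an init script runs as an ordinary OS process that only sees DBFS through the local FUSE mount at /dbfs. A quick check from a notebook, using the path from the listing above:

    import os.path

    # pip in an init script resolves plain OS paths, so the dbfs:/ URI has to
    # be rewritten with the /dbfs prefix before the node can open the file.
    wheel = "/dbfs/mnt/package-source/parser-3.0-py3-none-any.whl"
    print(os.path.exists(wheel))

If this prints True, an init script line like pip install /dbfs/mnt/package-source/parser-3.0-py3-none-any.whl should reach the same file, assuming the mount is already available when init scripts run, which is worth verifying separately.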

Write dataframe to blob using azure databricks

强颜欢笑 submitted on 2020-04-16 02:27:47

Question: Is there any link or sample code showing how to write a dataframe to Azure blob storage using Python (not using the pyspark module)?

Answer 1: Below is a code snippet for writing (dataframe) CSV data directly to an Azure blob storage container in an Azure Databricks notebook.

    # Configure the blob storage account access key globally
    spark.conf.set(
        "fs.azure.account.key.%s.blob.core.windows.net" % storage_name,
        sas_key)
    output_container_path = "wasbs://%s@%s.blob.core.windows.net" % (output_container_name,
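
Note that the accepted snippet still goes through Spark configuration, while the question explicitly asks for plain Python. A minimal sketch using pandas plus the azure-storage-blob SDK (v12) is below; the account URL, container, blob name, and key are hypothetical placeholders:

    import pandas as pd
    from azure.storage.blob import BlobClient

    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

    # Serialize the dataframe to CSV in memory and upload it as a blob.
    blob = BlobClient(
        account_url="https://mystorageaccount.blob.core.windows.net",
        container_name="output",
        blob_name="out.csv",
        credential="my_account_key",
    )
    blob.upload_blob(df.to_csv(index=False), overwrite=True)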

Can I transform a complex json object to multiple rows in a dataframe in Azure Databricks using pyspark?

雨燕双飞 submitted on 2020-03-25 16:46:14

Question: I have some JSON that's being read from a file, where each row looks something like this:

    {
      "id": "someGuid",
      "data": {
        "id": "someGuid",
        "data": {
          "players": {
            "player_1": {
              "id": "player_1",
              "locationId": "someGuid",
              "name": "someName",
              "assets": {
                "assetId1": {
                  "isActive": true,
                  "playlists": { "someId1": true, "someOtherId1": false }
                },
                "assetId2": {
                  "isActive": true,
                  "playlists": { "someId1": true }
                }
              }
            },
            "player_2": {
              "id": "player_2",
              "locationId": "someGuid",
              "name": "someName",
              "dict"
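
The usual approach is to read the JSON with an explicit schema that declares players as a map, then explode that map into one row per player. A minimal pyspark sketch under that assumption; the file path and the reduced player schema are hypothetical, and spark is the SparkSession a Databricks notebook already provides:

    from pyspark.sql.functions import explode
    from pyspark.sql.types import MapType, StringType, StructType, StructField

    # With an inferred schema, players becomes a struct keyed by player_1,
    # player_2, ...; declaring it as a MapType is what makes explode() work.
    player_schema = StructType([
        StructField("id", StringType()),
        StructField("locationId", StringType()),
        StructField("name", StringType()),
    ])
    schema = StructType([
        StructField("id", StringType()),
        StructField("data", StructType([
            StructField("id", StringType()),
            StructField("data", StructType([
                StructField("players", MapType(StringType(), player_schema)),
            ])),
        ])),
    ])

    df = spark.read.schema(schema).json("/mnt/data/players.json")
    rows = (df
            .select(explode("data.data.players").alias("player_key", "player"))
            .select("player_key", "player.*"))
    rows.show(truncate=False)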

Saving spark dataframe from azure databricks' notebook job to azure blob storage causes java.lang.NoSuchMethodError

a 夏天 submitted on 2020-02-06 10:14:06

Question: I have created a simple job using a notebook in Azure Databricks. I am trying to save a Spark dataframe from the notebook to Azure blob storage. Attaching the sample code:

    import traceback
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType

    # Attached the spark-submit command used:
    # spark-submit --master local[1] --packages org.apache.hadoop:hadoop-azure:2.7.2,
    #   com.microsoft.azure:azure-storage:3.1.0 ./write_to_blob_from_spark.py
    # Tried with com.microsoft.azure:azure
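
A java.lang.NoSuchMethodError here typically signals a version clash between the jars supplied via --packages (hadoop-azure, azure-storage) and the Hadoop classes already shipped with the Databricks runtime. Since Databricks has built-in wasbs:// support, a sketch that drops the extra packages entirely (the storage account, key, and container below are hypothetical):

    # Rely on the runtime's built-in wasbs support instead of shipping
    # hadoop-azure/azure-storage jars that can shadow cluster classes.
    spark.conf.set(
        "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
        "my_account_key",
    )
    df = spark.createDataFrame([("a", 1), ("b", 2)], ["key", "value"])
    df.write.mode("overwrite").csv(
        "wasbs://output@mystorageaccount.blob.core.windows.net/out"
    )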

How to install PYODBC in Databricks

限于喜欢 submitted on 2020-01-28 12:31:31

Question: I have to install the pyodbc module in Databricks. I tried the command pip install pyodbc, but it failed with the error below.

    [Error message attached as an image in the original post]

Answer 1: I had some problems a while back with connecting using pyodbc; details of my fix are here: https://datathirst.net/blog/2018/10/12/executing-sql-server-stored-procedures-on-databricks-pyspark I think the problem stems from PYTHONPATH on the Databricks clusters being set to the Python 2 install. I suspect the lines:

    %sh apt-get -y install
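
For completeness, pip install pyodbc commonly fails on Databricks because the module builds against the unixODBC system libraries, which are not installed by default. A hedged sketch of the usual two-step fix, run from a notebook (the pip path assumes the cluster's Python lives under /databricks/python, as in the shebang conventions seen earlier on this page):

    import subprocess

    # Install the unixODBC headers first, then build/install pyodbc against them.
    subprocess.run(["apt-get", "-y", "install", "unixodbc-dev"], check=True)
    subprocess.run(["/databricks/python/bin/pip", "install", "pyodbc"], check=True)

The same two commands are more often typed as %sh cells in a notebook, which is what the truncated answer above appears to be doing.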