azure-blob-storage

How to get a list of all folders in a container in Blob Storage?

Submitted by 笑着哭i on 2020-02-24 04:13:23
Question: I am using Azure Blob Storage to store some of my files away. I have them categorized in different folders. So far I can get a list of all blobs in the container using this: public async Task<List<Uri>> GetFullBlobsAsync() { var blobList = await Container.ListBlobsSegmentedAsync(string.Empty, true, BlobListingDetails.None, int.MaxValue, null, null, null); return (from blob in blobList.Results where !blob.Uri.Segments.LastOrDefault().EndsWith("-thumb") select blob.Uri).ToList(); } But how can …
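
Blob Storage has no real folder objects, so the usual approach is a delimiter-based (hierarchical) listing that returns virtual directory prefixes. A minimal sketch with the azure-storage-blob v12 Python SDK, assuming a placeholder connection string and container name (the question itself uses the older .NET client):

```python
# Minimal sketch with the azure-storage-blob (v12) Python SDK, assuming placeholder
# credentials; walk_blobs() with a delimiter yields a BlobPrefix item for each
# virtual directory level instead of flattening everything into one blob list.
from azure.storage.blob import BlobPrefix, ContainerClient

container = ContainerClient.from_connection_string(
    conn_str="<connection-string>",   # placeholder
    container_name="my-container",    # placeholder
)

def list_folders(prefix=""):
    """Return the virtual directory names directly under `prefix`."""
    folders = []
    for item in container.walk_blobs(name_starts_with=prefix, delimiter="/"):
        if isinstance(item, BlobPrefix):   # a BlobPrefix is a "folder"
            folders.append(item.name)      # e.g. "images/", "documents/"
    return folders

print(list_folders())
```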

Saving a Spark DataFrame from an Azure Databricks notebook job to Azure Blob Storage causes java.lang.NoSuchMethodError

Submitted by a 夏天 on 2020-02-06 10:14:06
Question: I have created a simple job using a notebook in Azure Databricks. I am trying to save a Spark DataFrame from the notebook to Azure Blob Storage. Attaching the sample code: import traceback from pyspark.sql import SparkSession from pyspark.sql.types import StringType # Attached the spark-submit command used # spark-submit --master local[1] --packages org.apache.hadoop:hadoop-azure:2.7.2, # com.microsoft.azure:azure-storage:3.1.0 ./write_to_blob_from_spark.py # Tried with com.microsoft.azure:azure…
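
A java.lang.NoSuchMethodError in this setup usually points at a clash between the hadoop-azure / azure-storage versions pulled in via --packages and the Hadoop build already on the cluster. A hedged sketch for checking the running Hadoop version from PySpark before choosing package coordinates; the version placeholders in the comments are illustrative, not a verified fix:

```python
# Hedged sketch, not a verified fix: find out which Hadoop build the running Spark uses,
# so the hadoop-azure / azure-storage coordinates passed to --packages can match it.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hadoop version of the running cluster, read via the py4j JVM gateway.
hadoop_version = spark.sparkContext._jvm.org.apache.hadoop.util.VersionInfo.getVersion()
print(hadoop_version)

# The --packages coordinates should then line up with that version (illustrative only):
# spark-submit --packages org.apache.hadoop:hadoop-azure:<hadoop_version>,\
#   com.microsoft.azure:azure-storage:<compatible-version> ./write_to_blob_from_spark.py
# On a Databricks runtime the Azure Hadoop connectors are already installed, so pulling
# different versions in via --packages can put two incompatible copies on the classpath.
```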

Azure Databricks: Accessing Blob Storage Behind a Firewall

Submitted by 生来就可爱ヽ(ⅴ<●) on 2020-01-24 12:53:48
Question: I am reading files on an Azure Blob Storage account (Gen2) from an Azure Databricks notebook. Both services are in the same region (West Europe). Everything works fine, except when I add a firewall in front of the storage account. I have opted to allow "trusted Microsoft services". However, running the notebook now ends up with an access denied error: com.microsoft.azure.storage.StorageException: This request is not authorized to perform this operation. I tried to access the storage directly …
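
"Allow trusted Microsoft services" does not cover an Azure Databricks workspace by itself; a direction that is commonly suggested is to deploy Databricks into your own VNet (VNet injection) and add its subnets to the storage firewall. A rough sketch of that last step with the azure-mgmt-storage Python SDK, where every name and resource ID is a placeholder:

```python
# Rough sketch, placeholders throughout: add a Databricks subnet to the storage account
# firewall while keeping the firewall on and the AzureServices bypass enabled.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import (
    NetworkRuleSet,
    StorageAccountUpdateParameters,
    VirtualNetworkRule,
)

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
account_name = "<storage-account>"
databricks_subnet_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/"
    "Microsoft.Network/virtualNetworks/<databricks-vnet>/subnets/<databricks-subnet>"
)

client = StorageManagementClient(DefaultAzureCredential(), subscription_id)
client.storage_accounts.update(
    resource_group,
    account_name,
    StorageAccountUpdateParameters(
        network_rule_set=NetworkRuleSet(
            default_action="Deny",      # firewall stays on
            bypass="AzureServices",     # the "trusted Microsoft services" toggle
            virtual_network_rules=[
                VirtualNetworkRule(virtual_network_resource_id=databricks_subnet_id)
            ],
        )
    ),
)
```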

NameError: name 'dbutils' is not defined in pyspark

Submitted by 时光毁灭记忆、已成空白 on 2020-01-24 10:48:47
Question: I am running a PySpark job in Databricks cloud. I need to write some of the CSV files to the Databricks filesystem (DBFS) as part of this job, and I also need to use some of the dbutils native commands, like: #mount azure blob to dbfs location dbutils.fs.mount(source="...", mount_point="/mnt/...", extra_configs="{key:value}") I am also trying to unmount once the files have been written to the mount directory. But when I am using dbutils directly in the PySpark job it is failing with NameError: name …
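
dbutils is only injected as a global inside notebooks; in a Python file submitted as a job it has to be constructed explicitly. A sketch of the commonly used helper, assuming a Databricks runtime (pyspark.dbutils does not exist in plain Apache Spark); the mount values in the trailing comment are placeholders:

```python
# Sketch of the usual workaround, assuming a Databricks runtime: build dbutils from the
# SparkSession instead of relying on the notebook-injected global.
from pyspark.sql import SparkSession

def get_dbutils(spark):
    try:
        from pyspark.dbutils import DBUtils   # present on Databricks clusters only
        return DBUtils(spark)
    except ImportError:
        import IPython                        # fallback when running inside a notebook
        return IPython.get_ipython().user_ns["dbutils"]

spark = SparkSession.builder.getOrCreate()
dbutils = get_dbutils(spark)

# Placeholder usage mirroring the mount/unmount in the question:
# dbutils.fs.mount(source="wasbs://<container>@<account>.blob.core.windows.net",
#                  mount_point="/mnt/data",
#                  extra_configs={"<conf-key>": "<conf-value>"})
# dbutils.fs.unmount("/mnt/data")
```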

Azure Blob Storage Indexer fails on images

Submitted by 旧时模样 on 2020-01-17 05:23:15
Question: I'm using Azure Search with a Blob Storage indexer. I'm seeing failures in the execution history: [ { "key": null, "errorMessage": "Document 'https://mystorage.blob.core.windows.net/my-documents/Document/Repository/F/AD/LO/LO-min-0002-00.png' has unsupported content type 'image/png'" } ] Does this failure cause other documents (with a supported content type) in the storage not to be indexed? Answer 1: Yes, by default 1 failed document will stop indexing. You can increase that limit if you just have …
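
To keep one unsupported blob from stopping the run, the blob indexer can be told to skip unsupported content types and/or tolerate failures. A hedged sketch of updating the indexer definition over the REST API with Python's requests; service name, key, resource names and api-version are placeholders:

```python
# Hedged sketch: update the indexer so unsupported content types are skipped rather than
# failing the run. The parameter names follow the Azure Cognitive Search blob indexer
# REST API; all values below are placeholders.
import requests

service = "<search-service-name>"
indexer = "<indexer-name>"
api_key = "<admin-api-key>"
url = f"https://{service}.search.windows.net/indexers/{indexer}?api-version=2020-06-30"

body = {
    "name": indexer,
    "dataSourceName": "<data-source-name>",
    "targetIndexName": "<index-name>",
    "parameters": {
        "maxFailedItems": -1,                       # -1: never stop the run on failed documents
        "configuration": {
            "failOnUnsupportedContentType": False,  # skip blobs like PNGs instead of erroring
            "excludedFileNameExtensions": ".png",   # or keep images out of the indexer entirely
        },
    },
}

response = requests.put(url, json=body, headers={"api-key": api_key})
response.raise_for_status()
```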

How to store a Spark DataFrame as CSV in Azure Blob Storage

Submitted by 无人久伴 on 2020-01-16 10:32:11
Question: I'm trying to store a Spark DataFrame as a CSV on Azure Blob Storage from a local Spark cluster. First, I set the config with the Azure account/account key (I'm not sure which config is the proper one, so I've set all of them): sparkContext.getConf.set(s"fs.azure.account.key.${account}.blob.core.windows.net", accountKey) sparkContext.hadoopConfiguration.set(s"fs.azure.account.key.${account}.dfs.core.windows.net", accountKey) sparkContext.hadoopConfiguration.set(s"fs.azure.account.key.${account}.blob…
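
For the wasbs:// (Blob) endpoint, the account key is read from the Hadoop configuration, and setting it on the SparkConf of an already-running context has no effect. A minimal PySpark sketch with placeholder account, container and key (the question's own code is Scala, and the hadoop-azure / azure-storage jars must already be on the local cluster's classpath):

```python
# Minimal PySpark sketch with placeholder values: put the account key into the Hadoop
# configuration, then write the DataFrame to a wasbs:// path as CSV.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv_to_blob").getOrCreate()

account, container, account_key = "<account>", "<container>", "<account-key>"
spark.sparkContext._jsc.hadoopConfiguration().set(
    f"fs.azure.account.key.{account}.blob.core.windows.net", account_key
)

df = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "value"])
(
    df.coalesce(1)                  # optional: a single part file
    .write.mode("overwrite")
    .option("header", True)
    .csv(f"wasbs://{container}@{account}.blob.core.windows.net/exports/df_csv")
)
```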
