Databricks cluster does not initialize Azure library with error: module 'lib' has no attribute 'SSL_ST_INIT'

佐手、 submitted on 2019-12-08 05:06:50

Question


I am using an Azure Databricks notebook with the Azure library to list the files in Blob Storage. The task is scheduled: the cluster is terminated when the job finishes and started again for the next run.

I am using version 4.0.0 of the Azure library (https://pypi.org/project/azure/).

Sometimes I get this error message:

  • AttributeError: module 'lib' has no attribute 'SSL_ST_INIT'

and, very rarely, also:

  • AttributeError: cffi library '_openssl' has no function, constant or global variable named 'CRYPTOGRAPHY_PACKAGE_VERSION'

I have found a workaround: uninstall the openssl or azure library, restart the cluster, and install it again. However, restarting the cluster may not be possible, because it may need to handle longer tasks, etc.

I also tried to install/upgrade openSSL 16.2.0 in an initialization script, but it does not help and starts conflicting with another openSSL library that is on the Databricks cluster by default.

Is there anything I can do about this?

Here is the code for getting the list of files from Blob Storage:

import pandas as pd
import re
import os
from pyspark.sql.types import *
import azure
from azure.storage.blob import BlockBlobService
import datetime
import time

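# accountName, accountKey, sourceStorageContainer and inputFolder are defined elsewhere in the notebook
r = []
marker = None
blobService = BlockBlobService(accountName, accountKey)
while True:
  # Fetch one page of results; marker is None for the first page
  result = blobService.list_blobs(sourceStorageContainer, prefix=inputFolder, marker=marker)
  for b in result.items:
    r.append(b.name)
  # Keep paging while the service returns a continuation marker
  if result.next_marker:
    marker = result.next_marker
  else:
    break
print(r)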

Thank you


Answer 1:


The solution for this issue is to downgrade the Azure library to 3.0.0.

It looks like Azure v4 conflicts with some of the libraries that are preinstalled on Databricks.
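
Since the cluster is recreated for every run, it can help to confirm at the start of the notebook that the downgraded version is actually the one installed. The following is only a minimal sketch: the helper names `installed_version` and `find_mismatches` are made up for illustration, and it assumes `importlib.metadata` (Python 3.8+) or the `importlib_metadata` backport is available on the cluster.

```python
# Sketch: compare installed package versions against the versions you expect.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Assumption: on older runtimes the importlib_metadata backport is installed
    from importlib_metadata import version, PackageNotFoundError


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # The pin suggested in this answer; an empty dict means everything matches.
    print(find_mismatches({"azure": "3.0.0"}))
```

If the result is not empty, the job could fail early with a clear message instead of hitting the AttributeError later during the import.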




Answer 2:


This issue is also related to the pyOpenSSL package. Downgrading it to version 18.0.0 did the trick for me. I used the script below as an init script at cluster initialization:

dbutils.fs.put("/databricks/script/pyOpenSSL-install.sh", """#!/bin/bash
/databricks/python/bin/pip uninstall pyOpenSSL -y
/databricks/python/bin/pip install pyOpenSSL==18.0.0
""", True)
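
If more than one package needs pinning, the script text can be generated rather than typed by hand. This is just a sketch under assumptions: `build_init_script` is a hypothetical helper, not part of Databricks, and the pip path is the one used in the script above.

```python
def build_init_script(pins, pip="/databricks/python/bin/pip"):
    """Return a bash init script that reinstalls each package at its pinned version."""
    lines = ["#!/bin/bash"]  # the shebang has to be the very first line of the file
    for package, wanted in pins.items():
        lines.append("{} uninstall {} -y".format(pip, package))
        lines.append("{} install {}=={}".format(pip, package, wanted))
    return "\n".join(lines) + "\n"


script = build_init_script({"pyOpenSSL": "18.0.0"})
print(script)

# In a Databricks notebook, where dbutils is predefined, the text could then be written with:
# dbutils.fs.put("/databricks/script/pyOpenSSL-install.sh", script, True)
```

Building the text as a list of lines also keeps `#!/bin/bash` at the very start of the file, which is easy to break when the triple-quoted string begins with a newline.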


Source: https://stackoverflow.com/questions/54984230/databricks-cluster-does-not-initialize-azure-library-with-error-module-lib-ha
