spark-shell error : No FileSystem for scheme: wasb

怎甘沉沦 posted on 2019-11-30 15:57:07

Another way of setting up Azure Storage (wasb and wasbs files) in spark-shell is:

  1. Copy the azure-storage and hadoop-azure jars into the ./jars directory of the Spark installation.
  2. Run spark-shell with the --jars parameter, followed by a comma-separated list of paths to those jars. Example:

    
    $ bin/spark-shell --master "local[*]" --jars jars/hadoop-azure-2.7.0.jar,jars/azure-storage-2.0.0.jar
    
  3. Set the following properties on the Spark context's Hadoop configuration:

    
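    // Hadoop resolves a URI scheme through fs.<scheme>.impl, so the wasb scheme is also mapped explicitly here.
    sc.hadoopConfiguration.set("fs.wasb.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
    sc.hadoopConfiguration.set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
    sc.hadoopConfiguration.set("fs.azure.account.key.my_account.blob.core.windows.net", "my_key")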
    
  4. Run a simple query:

    
    sc.textFile("wasb://my_container@my_account_host/myfile.txt").count()
    
  5. Enjoy :)

With these settings you can easily set up a Spark application by passing the same parameters to the 'hadoopConfiguration' of the current Spark context.
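
For an application, the settings from the steps above can be collected in one place. The following is only a minimal sketch: `WasbSettings` and its methods are hypothetical names, not part of Spark or Hadoop, and the `fs.wasb.impl` key is an assumption based on Hadoop's general `fs.<scheme>.impl` convention. The account, key and container values are the placeholders used above.

```scala
// Hypothetical helper (not part of Spark or Hadoop): gathers the wasb settings in one place.
object WasbSettings {
  // Blob endpoint host for a storage account, e.g. my_account.blob.core.windows.net
  def accountHost(account: String): String = s"$account.blob.core.windows.net"

  // Key/value pairs intended to be copied into sc.hadoopConfiguration.
  def hadoopSettings(account: String, key: String): Map[String, String] = Map(
    // Assumption: the generic fs.<scheme>.impl lookup maps "wasb" to its FileSystem class.
    "fs.wasb.impl" -> "org.apache.hadoop.fs.azure.NativeAzureFileSystem",
    s"fs.azure.account.key.${accountHost(account)}" -> key
  )

  // Builds a URI of the form wasb://container@host/path
  def wasbUri(container: String, account: String, path: String): String =
    s"wasb://$container@${accountHost(account)}/${path.stripPrefix("/")}"
}
```

In spark-shell, the map could then be applied with `WasbSettings.hadoopSettings("my_account", "my_key").foreach { case (k, v) => sc.hadoopConfiguration.set(k, v) }`, and a file read with `sc.textFile(WasbSettings.wasbUri("my_container", "my_account", "myfile.txt"))`.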

Hai Ning from Microsoft has written an excellent blog post on how to set up wasb on an Apache Hadoop installation.

Here is the summary:

  1. Add hadoop-azure-*.jar and azure-storage-*.jar to the Hadoop classpath

    1.1 Find the jars in your local installation. On an HDInsight cluster they are in the /usr/hdp/current/hadoop-client folder.

    1.2 Update the HADOOP_CLASSPATH variable in hadoop-env.sh. Use the exact jar names, since the Java classpath doesn't support partial wildcards.

  2. Update core-site.xml

    <property>
            <name>fs.AbstractFileSystem.wasb.impl</name>
            <value>org.apache.hadoop.fs.azure.Wasb</value>
    </property>
    
    <property>
            <name>fs.azure.account.key.my_blob_account_name.blob.core.windows.net</name> 
            <value>my_blob_account_key</value> 
    </property>
    
    <!-- optionally set the default file system to a container --> 
    <property>
            <name>fs.defaultFS</name>          
            <value>wasb://my_container_name@my_blob_account_name.blob.core.windows.net</value>
    </property>
    

See the exact steps here: https://github.com/hning86/articles/blob/master/hadoopAndWasb.md
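
To check that step 1 of the summary took effect, one option is to ask the JVM whether the Azure classes can be loaded. This is a minimal sketch, not something from the blog post: `ClasspathCheck` and `isOnClasspath` are hypothetical names, and only the two Hadoop class names in the usage note come from the snippets above.

```scala
// Hypothetical diagnostic: reports whether a class can be loaded from the current classpath.
object ClasspathCheck {
  def isOnClasspath(className: String): Boolean =
    try {
      // initialize = false: only locate and load the class, do not run its static initializers.
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      // Thrown when the class itself, or something it links against, is missing.
      case _: ClassNotFoundException | _: LinkageError => false
    }
}
```

Run inside spark-shell (or any JVM started with the Hadoop classpath), `ClasspathCheck.isOnClasspath("org.apache.hadoop.fs.azure.NativeAzureFileSystem")` and `ClasspathCheck.isOnClasspath("org.apache.hadoop.fs.azure.Wasb")` should both return true once the jars are picked up. A false result suggests the classpath is still missing the hadoop-azure jar.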
