I have a large Excel(xlsx and xls) file with multiple sheet and I need convert it to RDD or Dataframe so that it can be joined to othe
Excel(xlsx and xls)
RDD
Dataframe
Alternatively, you can use the HadoopOffice library (https://github.com/ZuInnoTe/hadoopoffice/wiki), which supports also encrypted Excel documents and linked workbooks, amongst other features. Of course Spark is also supported.