Can't install textract on Windows

轻奢々 2020-12-06 07:28

I've tried many things, but installing the textract package on Windows with pip still fails.

I'm getting the following error:

4 Answers
  • 2020-12-06 07:37

    The solution is much simpler now that the project appears to have been taken over by a new maintainer, who started updating it again about 3 months before I wrote this answer.

    You can now go to https://github.com/deanmalmgren/textract/releases and download either v1.6.2, which only updates the requirements over v1.6.1 (fixing the unicode debug error), or v1.6.3, which is the latest as of this writing.

    Once downloaded, extract the archive, cd into the extracted folder, and run: pip install .

    Keep in mind that whenever requirements are updated, malicious code could be introduced through a dependency, so update at your own risk.
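The extract-and-install step above can also be scripted. A minimal sketch, assuming the release was downloaded as a zip archive; the function names and the archive layout (one top-level folder containing setup.py) are assumptions for illustration, not part of textract itself:

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def extract_release(archive: Path, dest: Path) -> Path:
    """Unpack a release zip and return the outermost folder holding setup.py."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    candidates = [p.parent for p in dest.rglob("setup.py")]
    if not candidates:
        raise FileNotFoundError("no setup.py found in " + str(archive))
    # The project root is the candidate with the shortest path.
    return min(candidates, key=lambda p: len(p.parts))


def pip_install_command() -> list:
    """Build the equivalent of typing `pip install .` at the prompt."""
    return [sys.executable, "-m", "pip", "install", "."]


def install_from(project_dir: Path) -> None:
    """Run `pip install .` with project_dir as the working directory."""
    subprocess.run(pip_install_command(), cwd=project_dir, check=True)
```

Calling install_from(extract_release(Path("textract-1.6.3.zip"), Path("build"))) would mirror the manual steps; both paths here are placeholders.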

  • 2020-12-06 07:38

    (Windows 10, Python 3.7) I had more issues than others, but this builds on the previous answers:

    1. Make sure that the Microsoft Visual C++ compiler for Python is installed.

      • For Visual C++ 14.0 (also required by Scrapy as of June 2019), follow: https://wiki.python.org/moin/WindowsCompilers -->
        https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2017 --> https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=Community&rel=16
        Note: this may take a very long time to install, so be patient.
    2. Upgrade pip, setuptools, and wheel: python -m pip install --upgrade pip setuptools wheel

    3. Upgrade six: pip install six --upgrade

    4. Download EbookLib version 0.15:

      • Unzip the .zip file.
      • To avoid encoding errors, edit the long_description assignment in setup.py to read: long_description = open('README.md', encoding="utf-8").read(),
    5. Download Swig:

      • http://www.swig.org/download.html
      • Unzip the .zip file
      • Copy the swig.exe file into the Python installation folder, e.g. "C:\Users\username\AppData\Local\Programs\Python\Python37"
      • Copy the "typemaps" folder into the Python "Lib" folder, e.g. "C:\Program Files\swigwin-4.0.0\Lib\typemaps" --> "C:\Users\username\AppData\Local\Programs\Python\Python37\Lib\"
      • Copy the "*.swg" files into the Python "Lib" folder, e.g. "C:\Program Files\swigwin-4.0.0\Lib\*.swg" --> "C:\Users\username\AppData\Local\Programs\Python\Python37\Lib\"
      • Copy all the SWIG Python files into the Python "Lib" folder, e.g. "C:\Program Files\swigwin-4.0.0\Lib\python\*" --> "C:\Users\username\AppData\Local\Programs\Python\Python37\Lib\"
    6. cd into the unzipped EbookLib folder from the prompt, e.g. C:\> cd "C:\Users\username\Desktop\ebooklib-0.15"

    7. Run the EbookLib installation: pip install .

    8. Run the textract installation: pip install textract

    The output should be:

    C:\Users\username\Desktop\ebooklib-0.15>pip install textract
    Collecting textract
    Requirement already satisfied: docx2txt==0.6 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (0.6)
    Requirement already satisfied: beautifulsoup4==4.5.3 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (4.5.3)
    Requirement already satisfied: EbookLib==0.15 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (0.15)
    Requirement already satisfied: xlrd==1.0.0 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (1.0.0)
    Requirement already satisfied: SpeechRecognition==3.6.3 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (3.6.3)
    Requirement already satisfied: six==1.10.0 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (1.10.0)
    Collecting pocketsphinx==0.1.3 (from textract)
      Using cached https://files.pythonhosted.org/packages/93/5f/a968e5d53d25e32deb78c3e169fd8612ecf53cc76e32cb40e19be35696af/pocketsphinx-0.1.3.tar.bz2
    Requirement already satisfied: chardet==2.3.0 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (2.3.0)
    Requirement already satisfied: argcomplete==1.8.2 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (1.8.2)
    Requirement already satisfied: python-pptx==0.6.5 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from textract) (0.6.5)
    Requirement already satisfied: lxml in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from EbookLib==0.15->textract) (4.3.3)
    Requirement already satisfied: XlsxWriter>=0.5.7 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from python-pptx==0.6.5->textract) (1.1.8)
    Requirement already satisfied: Pillow>=2.6.1 in c:\users\username\appdata\local\programs\python\python37\lib\site-packages (from python-pptx==0.6.5->textract) (6.0.0)
    Building wheels for collected packages: pocketsphinx
      Building wheel for pocketsphinx (setup.py) ... done
      Stored in directory: C:\Users\username\AppData\Local\pip\Cache\wheels\38\80\4f\ddc3e8c2b788f2c7f1d625ae870f6bafd3038ff04a3445a2f8
    Successfully built pocketsphinx
    Installing collected packages: pocketsphinx, textract
    Successfully installed pocketsphinx-0.1.3 textract-1.6.1
    
    C:\Users\username\Desktop\ebooklib-0.15>
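
The edit in step 4 matters because, on Windows, open() without an encoding argument falls back to the locale code page, which is where a 'charmap' UnicodeDecodeError comes from. A minimal sketch of applying that edit programmatically, assuming the original line in setup.py reads open('README.md').read(); the function name force_utf8_readme is hypothetical:

```python
import re

# Matches open('README.md') or open("README.md") with no extra arguments.
_README_OPEN = re.compile(r"""open\(\s*(['"])README\.md\1\s*\)""")


def force_utf8_readme(setup_source: str) -> str:
    """Add encoding="utf-8" to bare open('README.md') calls in setup.py text."""
    return _README_OPEN.sub(r'open(\1README.md\1, encoding="utf-8")', setup_source)
```

Calls that already pass an encoding do not match the pattern, so running the function twice leaves the text unchanged.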
    

    At the time of writing, jsonschema has dependencies that conflict with textract. The following errors also came up while I worked out the proper installation:

    ERROR: requests 2.22.0 has requirement chardet<3.1.0,>=3.0.2, but you'll have chardet 2.3.0 which is incompatible.
    ERROR: camelot-py 0.7.2 has requirement chardet>=3.0.4, but you'll have chardet 2.3.0 which is incompatible.
    
    ERROR: Command "python setup.py egg_info" failed with error code 1 in C:\Users\username\AppData\Local\Temp\pip-install-msmb9od3\EbookLib\
        UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1671: character maps to <undefined>
    error: command 'C:\\Users\\username\\AppData\\Local\\Programs\\Python\\Python37\\swig.exe' failed with exit status 1
    
    ERROR: Failed building wheel for pocketsphinx
    error: command 'swig.exe' failed: No such file or directory
      (1) : Error: Unable to find 'swig.swg'
      (3) : Error: Unable to find 'python.swg'
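
The copy operations in step 5 can be sketched in Python as follows. This is only an illustration under the assumptions of that step (swig.exe at the root of the unzipped swigwin folder, library files under its Lib folder); the function names are hypothetical:

```python
import shutil
from pathlib import Path


def plan_swig_copies(swig_root: Path, python_root: Path) -> list:
    """Return (source, destination) pairs for the files listed in step 5."""
    lib_src = swig_root / "Lib"
    lib_dst = python_root / "Lib"
    pairs = [(swig_root / "swig.exe", python_root / "swig.exe")]
    # The typemaps folder keeps its name under the Python Lib folder.
    pairs += [(p, lib_dst / "typemaps" / p.name)
              for p in sorted((lib_src / "typemaps").glob("*")) if p.is_file()]
    # Top-level *.swg files go directly into Lib.
    pairs += [(p, lib_dst / p.name) for p in sorted(lib_src.glob("*.swg"))]
    # The contents of Lib\python are flattened into Lib as well.
    pairs += [(p, lib_dst / p.name)
              for p in sorted((lib_src / "python").glob("*")) if p.is_file()]
    return pairs


def copy_swig_files(swig_root: Path, python_root: Path) -> list:
    """Copy the planned files and return the destination paths."""
    pairs = plan_swig_copies(swig_root, python_root)
    for src, dst in pairs:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return [dst for _, dst in pairs]
```

Calling plan_swig_copies first gives a dry run of what would be copied before anything is written into the Python folder.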
    
  • 2020-12-06 08:00

    Stolen from here:

    I first needed to install swig with conda (Miniconda):

    conda install swig
    

    Then I downloaded the EbookLib 0.15 zip from the releases page:

    https://github.com/aerkalov/ebooklib/releases
    

    After unzipping it, I manually removed the Unicode character from the README.md file using Notepad++ (the character is on line 44).

    Then I installed the module with pip:
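
The manual README cleanup can also be done with a few lines of Python. A rough sketch, assuming README.md is UTF-8 encoded; the function names are hypothetical, and dropping every non-ASCII character is a blunt substitute for removing just the one offending character:

```python
from pathlib import Path


def find_non_ascii(text: str) -> list:
    """Return (line_number, character) for every non-ASCII character."""
    return [(number, ch)
            for number, line in enumerate(text.splitlines(), start=1)
            for ch in line if ord(ch) > 127]


def strip_non_ascii(text: str) -> str:
    """Drop every character outside the ASCII range."""
    return "".join(ch for ch in text if ord(ch) < 128)


def clean_readme(path: Path) -> None:
    """Rewrite the file in place without its non-ASCII characters."""
    text = path.read_text(encoding="utf-8")
    path.write_text(strip_non_ascii(text), encoding="ascii")
```

Running find_non_ascii on the README text first shows which lines are affected before anything is rewritten.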

    cd to_unzipped_folder_path_here
    pip install .
    

    And finally:

    pip install textract
    
  • 2020-12-06 08:00

    Not the most elegant solution, but it works!

    pip install git+https://github.com/jpweytjens/textract
    

    Thanks to jpweytjens
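
Whichever installation route is used, a quick check from Python confirms that the package can be found. A minimal sketch; is_installed and extract_text are hypothetical helper names, while textract.process is the library's documented entry point and returns bytes:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


def extract_text(path: str) -> str:
    """Extract text from a document, assuming textract is installed."""
    import textract  # imported lazily so the check above works without it

    return textract.process(path).decode("utf-8")
```

If is_installed("textract") returns False after pip reports success, pip and the interpreter are probably pointing at different Python installations.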
