Why is pytesseract causing AttributeError: 'NoneType' object has no attribute 'bands'?

Question

I am trying to get started with pytesseract, but as the session below shows, my first call to it fails.

I have found people reporting what seems to be the same error, and some say it is a bug in PIL 1.1.7. Others say the problem is that PIL loads images lazily, so you need to force it to read the image with im.load() after opening it, but that didn't help in my case. Any suggestions gratefully received.

K:\Glamdring\Projects\Images\OCR>python
Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image
>>> import pytesseract
>>> pytesseract.image_to_string(Image.open('foo.png'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 143, in image_to_string
  File "c:\Python27_32\lib\site-packages\PIL\Image.py", line 1497, in split
    if self.im.bands == 1:
AttributeError: 'NoneType' object has no attribute 'bands'

Answer 1:

Try keeping the image in its own variable and calling load() on it before handing it to pytesseract, instead of nesting Image.open() inside the call.
That solved my problem:

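try:
    import Image  # legacy standalone PIL install
except ImportError:
    from PIL import Image  # PIL/Pillow package layout
import pytesseract

# Open the image into its own variable and force PIL to read the pixel data
img = Image.open('myImage.jpg')
img.load()

# Run OCR on the already-loaded image object
text = pytesseract.image_to_string(img)
print(text)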

Answer 2:

I have no prior experience with PIL, but I was bored and looked into it, and from what I can tell this is probably a bug.

Walking through the execution steps shows that pytesseract itself is not at fault.

First, your Image.open('foo.png') call succeeds; nothing in your stack trace points to it.
pytesseract.image_to_string(img) then runs and does the following:
```
# Omitting the rest of the method.

# calls method split() of your image object.
if len(image.split()) == 4:
```
This is the first statement that acts on the image, so we have to look into PIL to find the root of the problem.

Your stack trace shows AttributeError: 'NoneType' object has no attribute 'bands' raised by the if self.im.bands == 1: statement. This means that self.im is None at that point.

Let's look at the Image.split() method:

"""
Split this image into individual bands. This method returns a
tuple of individual image bands from an image. For example,
splitting an "RGB" image creates three new images each
containing a copy of one of the original bands (red, green,
blue).

:returns: A tuple containing bands.
"""

self.load() # This is the culprit since..
if self.im.bands == 1: # .. here the im attribute of the image = None
    ims = [self.copy()]

# Omitting the rest ---

Evidently self.load() is what sets im, among other attributes. I verified this with a test image, and it worked without issues (I suggest you try the same with your image):

In [7]: print img.im
None

In [8]: img.load()
Out[8]: <PixelAccess at 0x7fe03ab6a210>

In [9]: print img.im
<ImagingCore object at 0x7fe03ab6a1d0>
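
To make the failure mode concrete, here is a toy model of lazy loading in plain Python. It is not PIL's actual code: LazyImage, _Core and the readable flag are made-up names, and the idea that load() can return without populating im is only the hypothesis from this answer. It shows how split() ends up with the same AttributeError when im stays None:

```python
class _Core(object):
    """Hypothetical stand-in for the decoded pixel data."""

    def __init__(self, bands):
        self.bands = bands


class LazyImage(object):
    """Toy lazily loaded image; an illustration, not PIL's implementation."""

    def __init__(self, bands, readable=True):
        self._bands = bands
        self._readable = readable
        self.im = None  # nothing is decoded when the image is opened

    def load(self):
        # Assumption: decoding can finish without ever assigning `im`.
        if self._readable:
            self.im = _Core(self._bands)

    def split(self):
        self.load()
        if self.im.bands == 1:  # AttributeError here when `im` is still None
            return [self]
        return [LazyImage(1) for _ in range(self.im.bands)]


ok = LazyImage(3)
print(ok.im)            # None before load()
print(len(ok.split()))  # 3 once load() has populated `im`

try:
    LazyImage(3, readable=False).split()
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'bands'
```

If your real image behaves like the readable=False case, printing img.im after img.load() should still show None.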

Now let's look at load(). I don't know the internals well enough to be sure, but I did notice something suspicious: several FIXME comments just before im is assigned, specifically:

# -- Omitting rest --         

# FIXME: on Unix, use PROT_READ etc
self.map = mmap.mmap(file.fileno(), size)
self.im = Image.core.map_buffer(
                    self.map, self.size, d, e, o, a
                    )

# -- Omitting rest --

if hasattr(self, "tile_post_rotate"):
    # FIXME: This is a hack to handle rotated PCD's
    self.im = self.im.rotate(self.tile_post_rotate)
    self.size = self.im.size

These comments suggest that this part of the code has known rough edges, though I can't be certain they are related to your error.

Of course, the problem might also be specific to your image. The load() method worked fine with an image I supplied (and pytesseract just gave me a different error :P). You're probably better off opening a new issue for this. If any PIL experts happen to see this, enlighten us if you can.

Answer 3:

im.load() worked for me when I ran the program in administrator mode. Also, add this line if the tesseract executable is not on your PATH:

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
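
If you are not sure whether tesseract is on your PATH, something like the following could check before you hard-code anything. This is only a sketch: find_executable is a helper name I made up, it assumes Python 3.3+ for shutil.which, and the fallback path is simply the install location quoted above, which may differ on your machine.

```python
import os
import shutil


def find_executable(name, fallback_paths=()):
    """Return a usable path for `name`, or None if nothing is found."""
    found = shutil.which(name)  # search the directories on PATH first
    if found:
        return found
    for path in fallback_paths:
        if os.path.isfile(path):  # explicit install location exists
            return path
    return None


cmd = find_executable(
    "tesseract",
    ["C:/Program Files (x86)/Tesseract-OCR/tesseract.exe"],
)
print(cmd if cmd else "tesseract not found; set tesseract_cmd by hand")
```

When cmd is not None, it can be assigned to pytesseract.pytesseract.tesseract_cmd in the same way as the line above.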

If you have already read the image with imread() instead of opening it with PIL, or grabbed a frame from a video, then the variable (with or without further image processing applied) holds an array rather than a PIL image. In that case convert it first and call pytesseract.image_to_string(Image.fromarray(image))
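
A minimal sketch of that conversion, assuming the array is a uint8 NumPy array and came from OpenCV's cv2.imread, which returns colour channels in BGR order. The helper name array_to_pil and the is_bgr flag are mine, not part of PIL or pytesseract:

```python
import numpy as np
from PIL import Image


def array_to_pil(frame, is_bgr=True):
    """Wrap a uint8 NumPy image array in a PIL Image."""
    arr = np.asarray(frame)
    if is_bgr and arr.ndim == 3 and arr.shape[2] == 3:
        # Reverse the channel axis (BGR -> RGB) and make the data contiguous.
        arr = np.ascontiguousarray(arr[:, :, ::-1])
    return Image.fromarray(arr)


# Stand-in for a frame returned by imread(): 2 rows, 3 columns, 3 channels.
frame = np.zeros((2, 3, 3), dtype=np.uint8)
frame[0, 0] = (255, 0, 0)  # pure blue in BGR order

img = array_to_pil(frame)
print(img.mode, img.size)  # RGB (3, 2)
```

The resulting img can be passed to pytesseract.image_to_string(img). Pass is_bgr=False if the array is already in RGB order.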

Answer 4:

As @J_Mascis said, using a separate image object worked here too:

    import pytesseract
    from PIL import Image
    img = Image.open('im.jpg')
    img.load()

    print(pytesseract.image_to_string(img, lang='eng'))  # 'eng' for English

Source: https://stackoverflow.com/questions/32791563/why-is-pytesseract-causing-attributeerror-nonetype-object-has-no-attribute-b

Tags

python-2.7

python-imaging-library

ocr