Question
I am trying to get started with pytesseract, but as you can see below, I am running into problems.
I have found people getting what seems to be the same error, and they say it is a bug in PIL 1.1.7. Others say the problem is caused by PIL being lazy, and that one needs to force PIL to load the image with im.load() after opening it, but that didn't seem to help. Any suggestions gratefully received.
K:\Glamdring\Projects\Images\OCR>python
Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image
>>> import pytesseract
>>> pytesseract.image_to_string(Image.open('foo.png'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 143, in image_to_string
  File "c:\Python27_32\lib\site-packages\PIL\Image.py", line 1497, in split
    if self.im.bands == 1:
AttributeError: 'NoneType' object has no attribute 'bands'
Answer 1:
Try keeping the image object and the pytesseract call separate: open the image into its own variable, call load() on it, and only then pass it to pytesseract.
That solved my problem:
try:
    import Image  # legacy standalone PIL install
except ImportError:
    from PIL import Image
import pytesseract

img = Image.open('myImage.jpg')
img.load()  # force PIL to read the pixel data now
i = pytesseract.image_to_string(img)
print(i)
Answer 2:
I have no prior experience with PIL, but I was bored so I looked into it and, from what I can tell, this is probably a bug.
If we follow the execution steps, it isn't the fault of pytesseract.
- Initially, your Image.open('foo.png') call works perfectly fine, with no errors relating to your stack trace.
- pytesseract.image_to_string(img) comes in afterwards and does the following:

# Omitting the rest of the method.
# calls method split() of your image object.
if len(image.split()) == 4:

This is the first statement acting on image, so we know we have to look back into PIL to find the root of the problem.
Your stack trace has the specific message AttributeError: 'NoneType' object has no attribute 'bands' with regard to the if self.im.bands statement. This means that im is None.
Let's look into the image.split() method:

"""
Split this image into individual bands. This method returns a
tuple of individual image bands from an image. For example,
splitting an "RGB" image creates three new images each
containing a copy of one of the original bands (red, green,
blue).

:returns: A tuple containing bands.
"""
self.load()  # This is the culprit since..
if self.im.bands == 1:  # .. here the im attribute of the image = None
    ims = [self.copy()]
# Omitting the rest ---
Evidently, self.load() sets, among other things, the im value. I verified this with a test image and it worked with no issues (I suggest you try the same with your image):

In [7]: print img.im
None
In [8]: img.load()
Out[8]: <PixelAccess at 0x7fe03ab6a210>
In [9]: print img.im
<ImagingCore object at 0x7fe03ab6a1d0>
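To make the failure mode concrete, here is a toy model of a lazily loaded image. It is only a sketch and not PIL's code: LazyImage, Core and the decoder callables are hypothetical names made up for illustration. It shows how split() can hit the AttributeError when load() runs but leaves im set to None.

```python
from collections import namedtuple

# Hypothetical stand-in for decoded pixel data; only the band count matters here.
Core = namedtuple("Core", ["bands"])


class LazyImage(object):
    """Toy lazily loaded image (illustration only, not PIL's Image class)."""

    def __init__(self, path, decoder):
        self.path = path
        self._decoder = decoder  # callable returning a Core, or None on failure
        self.im = None           # stays None until load() produces data

    def load(self):
        # Decode on first use only.
        if self.im is None:
            self.im = self._decoder(self.path)
        return self.im

    def split(self):
        self.load()
        # If the decoder returned None, the next line raises AttributeError.
        if self.im.bands == 1:
            return ["band 0"]
        return ["band %d" % i for i in range(self.im.bands)]


good = LazyImage("ok.png", lambda path: Core(bands=3))
bad = LazyImage("broken.png", lambda path: None)

print(good.im)            # nothing has been decoded yet
print(len(good.split()))  # split() calls load() first
try:
    bad.split()
except AttributeError as exc:
    print(exc)
```

In this model the error comes from the loading step and not from the caller, which matches the reasoning above that pytesseract is only where the problem surfaces.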
Let's now take a look at load(). I don't know the internals well enough to be sure, but I did notice something iffy: several FIXME comments right before the assignment of im, specifically:

# -- Omitting rest --
# FIXME: on Unix, use PROT_READ etc
self.map = mmap.mmap(file.fileno(), size)
self.im = Image.core.map_buffer(
    self.map, self.size, d, e, o, a
)
# -- Omitting rest --
if hasattr(self, "tile_post_rotate"):
    # FIXME: This is a hack to handle rotated PCD's
    self.im = self.im.rotate(self.tile_post_rotate)
    self.size = self.im.size

This may indicate that there are some issues needing attention here, though I can't be 100% certain.
Of course, this might be caused by your image for some reason. The load() method worked fine with an image I supplied (and pytesseract just gave me a different error :P). You're probably better off creating a new issue for this. If any PIL experts happen to see this, enlighten us if you can.
Answer 3:
im.load() worked for me when I ran the program in administrator mode. Also add this line if you don't have the tesseract executable in your PATH:
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
If you have already read the image some other way (with imread() instead of Image.open() and im.load(), or as a frame from a video), whether or not you have processed it further, convert that variable to a PIL image before passing it to pytesseract:
pytesseract.image_to_string(Image.fromarray(image))
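The PATH advice above can be turned into a small lookup so the install location is not hard-coded unconditionally. This is a minimal sketch assuming Python 3 (shutil.which does not exist on Python 2.7). The helper find_tesseract and its candidates parameter are hypothetical names, and the candidate path is just the one quoted in this answer.

```python
import os
import shutil


def find_tesseract(candidates=(), name="tesseract"):
    """Return a path to the executable, or None if it cannot be found.

    Hypothetical helper: checks PATH first, then each fallback path in order.
    """
    found = shutil.which(name)
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Fallback location taken from the answer above; adjust for your install.
tesseract_path = find_tesseract(
    ["C:/Program Files (x86)/Tesseract-OCR/tesseract.exe"]
)
print(tesseract_path)
```

If the result is not None, it can be assigned to pytesseract.pytesseract.tesseract_cmd as shown in the line above. If it is None, tesseract is probably not installed where expected.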
Answer 4:
As @J_Mascis said, using objects worked here too:
import pytesseract
from PIL import Image
img = Image.open('im.jpg')
img.load()
print(pytesseract.image_to_string(img, lang='eng'))  # 'eng' for English
Source: https://stackoverflow.com/questions/32791563/why-is-pytesseract-causing-attributeerror-nonetype-object-has-no-attribute-b