Question
I'd like to automatically clean up visible borders/shadows in scanned pages.
My idea for doing this is simple: detect a largest rectangle in the image in which all pixels are white or nearly white, then crop the image to that rectangle or floodfill the exterior with white.
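For reference, the rectangle search itself can be sketched in a few lines. This is a minimal Python sketch, not an ImageMagick or netpbm feature. It assumes the scan is already loaded as a list of rows of 0-255 grayscale values; the function name `largest_white_rectangle` and the cutoff of 240 for "nearly white" are illustrative assumptions. It uses the usual row-by-row histogram technique with a stack.

```python
def largest_white_rectangle(gray, threshold=240):
    """Return (x, y, width, height) of a largest all-near-white rectangle.

    gray: list of rows of 0-255 grayscale values.
    threshold: assumed cutoff; values >= threshold count as nearly white.
    """
    if not gray or not gray[0]:
        return (0, 0, 0, 0)
    cols = len(gray[0])
    heights = [0] * cols  # length of the white run ending at the current row
    best_area, best = 0, (0, 0, 0, 0)
    for y, row in enumerate(gray):
        for x, value in enumerate(row):
            heights[x] = heights[x] + 1 if value >= threshold else 0
        # Largest rectangle under the histogram `heights` (monotonic stack).
        stack = []  # column indices whose heights are strictly increasing
        for x in range(cols + 1):
            current = heights[x] if x < cols else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= current:
                h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                area = h * (x - left)
                if area > best_area:
                    best_area = area
                    best = (left, y - h + 1, x - left, h)
            stack.append(x)
    return best
```

The returned tuple maps onto a crop geometry of the form <width>x<height>+<x>+<y>.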
I can write my own program for finding such a rectangle, but I'd prefer to use ImageMagick (which can also do the cropping or floodfilling), netpbm, or other utilities readily available for Linux and Cygwin.
Can they do this? How?
PS: I just found a very similar question. If the answer there works for me, this will be a duplicate.
Answer 1:
convert has filters that you can apply before doing the autocrop. I have an example here:
http://www.alexiswilke.me/blog/learning-more-about-convert-imagemagick
So use something like:
convert <in-image> -level 20%,80%,1.0 <out-image>
This will make dark areas pitch black and white areas full white.
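As a rough model of what -level 20%,80%,1.0 does to each pixel, here is a small Python sketch. It assumes the operator is a linear stretch between the black and white points followed by gamma correction; it is meant for intuition only and is not ImageMagick's implementation. The function name `level` is made up for the example.

```python
def level(value, black=0.20, white=0.80, gamma=1.0):
    """Map a normalized intensity (0.0 to 1.0) through a level stretch.

    Values at or below `black` become 0.0, values at or above `white`
    become 1.0, and values in between are stretched linearly.
    """
    stretched = (value - black) / (white - black)
    clamped = min(1.0, max(0.0, stretched))
    return clamped ** (1.0 / gamma)
```

With these defaults a 10% gray shadow pixel maps to pure black and a 90% off-white paper pixel maps to pure white, which is what makes the later border detection easier.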
Next, compare the image row by row, starting at the top, to find how many rows to remove from the top. This can be done with the compare tool (its -fuzz option could also stand in for the -level filter during the comparison). I did not try this closely, so I cannot give you the exact command line for that one...
http://www.imagemagick.org/script/compare.php
Once the compare process is done, you should have the number of lines to remove at the top, at the bottom, on the left and on the right (if the tools cannot test columns, think about rotating the image by 90 degrees.)
Finally, you have the geometry and you can apply the crop:
convert <in-image> -crop <width>x<height>+<xpos>+<ypos> <final-image>
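The border counting described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the leveled image is available as a list of rows of 0-255 values, a line counts as border when at least half of its pixels are dark, and the names `count_border`, `crop_geometry`, `threshold` and `min_dark_fraction` are all made up for the example. Columns are handled by transposing the rows rather than rotating the image.

```python
def count_border(lines, threshold=128, min_dark_fraction=0.5):
    """Count leading lines in which at least min_dark_fraction of pixels are dark."""
    count = 0
    for line in lines:
        dark = sum(1 for value in line if value < threshold)
        if dark < min_dark_fraction * len(line):
            break
        count += 1
    return count


def crop_geometry(gray, threshold=128, min_dark_fraction=0.5):
    """Return a <width>x<height>+<xpos>+<ypos> string for the area inside the borders."""
    rows = gray
    columns = list(zip(*gray))  # transpose instead of rotating by 90 degrees
    top = count_border(rows, threshold, min_dark_fraction)
    bottom = count_border(rows[::-1], threshold, min_dark_fraction)
    left = count_border(columns, threshold, min_dark_fraction)
    right = count_border(columns[::-1], threshold, min_dark_fraction)
    width = max(0, len(columns) - left - right)
    height = max(0, len(rows) - top - bottom)
    return f"{width}x{height}+{left}+{top}"
```

The resulting string can be passed as the argument of -crop. The half-dark rule is a guess; pages with heavy text near the edge may need a stricter fraction.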
Update:
Thinking about it, the -level option of convert would work very well alongside the pnmcrop tool. That means you'd first do a convert, crop that converted image, search for the location of the cropped image in the original, and use that geometry to crop the original. A synopsis would be something like this:
convert <original> -level 20%,80% <temp>
pnmcrop <temp>
compare <original> <temp>
convert <original> -crop ... <final>
Put that in a script and you've got an auto-crop that copes with borders that are not a pure color, as mentioned.
Hmmm... Actually, the compare command would certainly work a lot better if we compare with the <temp> image.
convert <original> -level 20%,80% <tempA>
pnmcrop <tempA> > <tempB>
compare -metric RMSE -subimage-search <tempA> <tempB> <result>
convert <original> -crop ... <final>
Not too sure about the exact pnmcrop and compare command line options, but think of it like this: <tempA> is written once by convert (1st line), then used to generate <tempB>, and then we search for <tempB> inside of <tempA> to get a position and size (geometry) that we finally reuse for the crop command (the last convert.)
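To make the "search <tempB> inside of <tempA>" step concrete, here is a brute-force Python sketch. It is not how compare works internally; it assumes both images are lists of rows (each row a list of pixel values) and that <tempB> was cut from <tempA> unchanged, so an exact match exists. The function name `find_subimage` is made up for the example.

```python
def find_subimage(large, small):
    """Return (x, y) of the first exact occurrence of small inside large, or None.

    Both arguments are lists of rows, and each row is a list of pixel values.
    """
    small_h, small_w = len(small), len(small[0])
    large_h, large_w = len(large), len(large[0])
    for y in range(large_h - small_h + 1):
        for x in range(large_w - small_w + 1):
            # Compare the candidate window row by row against the small image.
            if all(large[y + dy][x:x + small_w] == small[dy]
                   for dy in range(small_h)):
                return (x, y)
    return None
```

The offset found this way, together with the size of the small image, gives the <width>x<height>+<xpos>+<ypos> geometry for the final crop. A real tool would use a similarity metric instead of exact equality.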
Answer 2:
I do this (my question is the similar one you link to) with a combination of ImageMagick and LSD (a line segment detector).
Your mileage may vary with different settings being tweaked (in fact, my algorithm runs through this whole process several times with different settings and at different resolutions until one is deemed "good enough"), but the general strategy I have is this:
- Convert the image with ImageMagick to a black-and-white (just black pixels and white pixels, not grayscale) PGM image.
- Produce an EPS image from the PGM image of just the edges of the page, using LSD with some very extreme parameters.
- Store the rotation angle of the EPS, as detected by ImageMagick with deskew.
- Rotate the EPS with ImageMagick to make it straight. (My scanned images can be crooked.)
- Store the crop dimensions of the EPS using ImageMagick's trim.
- Take the original scan, rotate it by the rotation angle, and crop it to the crop dimensions using ImageMagick.
- If needed, use ImageMagick morphology to remove specks from poor-quality scans.
All of the params I use are fairly arbitrary/specific to my use case, but this is the general approach. Good luck!
Source: https://stackoverflow.com/questions/23299784/how-do-i-find-the-largest-nearly-white-rectangle-in-a-bitmap-with-imagemagick