Question
I am running a loop in a bash script that passes PNG files to tesseract to read the text in each image. If the tesseract OCR output is "Empty page!!" or nothing at all, I want the loop to proceed to the next image. If the output does include text, I want to store it in a text file.
This is what my basic script looks like:
for i in {1..100}
do
tesseract file-${i}.png stdout >> result.txt
done
Answer 1:
This is roughly what you need. I took the liberty of using a glob to iterate over the png files in a directory, rather than counting from 1 to 100:
for file in /my/directory/*.png
do
    # Capture the output in a variable. This works even if the output is multiline.
    output="$(tesseract "$file" stdout)"
    # Append only when the output is non-empty and is not the "Empty page!!" notice.
    if [ -n "$output" ] && [ "$output" != "Empty page!!" ]
    then
        echo "$output" >> result.txt
    fi
done
This is a bit rough: you may want to check the exit code from tesseract in case of errors, or discard its standard error messages. But this should give you the idea.
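The two refinements mentioned above (checking the exit code and discarding standard error) could look roughly like the sketch below. The function name ocr_dir and the OCR_CMD override variable are illustrative names invented here, not part of tesseract; the only tesseract usage assumed is the same "tesseract FILE stdout" form as in the question. Note that if your tesseract build prints notices such as "Empty page!!" to standard error rather than standard output, they are dropped by the 2>/dev/null redirect and the empty-output check does the skipping.

```shell
# Sketch only: ocr_dir and OCR_CMD are hypothetical names used for illustration.
# Usage: ocr_dir DIRECTORY OUTPUT_FILE
ocr_dir() {
    local dir="$1" out="$2" file output
    for file in "$dir"/*.png; do
        # If the glob matched nothing, the pattern is left unexpanded; skip it.
        [ -e "$file" ] || continue

        # Capture stdout only and discard stderr. The assignment's exit
        # status is that of the command substitution, so a non-zero status
        # means the OCR command failed on this file.
        if ! output="$("${OCR_CMD:-tesseract}" "$file" stdout 2>/dev/null)"; then
            echo "skipping $file: OCR command failed" >&2
            continue
        fi

        # Skip output that is empty or consists only of whitespace.
        if [ -z "$(printf '%s' "$output" | tr -d '[:space:]')" ]; then
            continue
        fi

        # Skip the notice from the question in case it arrives on stdout.
        if [ "$output" = "Empty page!!" ]; then
            continue
        fi

        printf '%s\n' "$output" >> "$out"
    done
}
```

With the function defined, a call such as ocr_dir /my/directory result.txt appends the recognized text of each png to result.txt and reports failed files on standard error. Setting OCR_CMD to another command name is only a convenience for trying the loop without tesseract installed.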
Source: https://stackoverflow.com/questions/62462879/how-do-i-check-for-output-of-tesseract-in-bash-script