How do I check for output of tesseract in bash script?

Question

I am running a loop in a bash script that passes PNG files to tesseract to read the text in each image. If the tesseract OCR output is "Empty page!!" or nothing at all, I want the loop to proceed to the next image. If the output does include text, I want to store it in a text file.

This is what my basic script looks like:

for i in {1..100}
do
tesseract file-${i}.png stdout >> result.txt
done

Answer 1:

This is roughly what you need. I took the liberty of using a glob to list the png files in a directory (much as "ls" would), rather than iterating from 1 to 100:

for file in /my/directory/*.png
do
  # Redirect output to a variable. This works even if output is multiline.
  output="$(tesseract "$file" stdout)"
  
  if [ -n "$output" ] && [ "$output" != "Empty page!!" ]
  then
    echo "$output" >> result.txt
  fi
done

This is a bit rough. You may want to check the exit code from tesseract in case there are errors, or you may want to discard its standard error messages, and so on. But this should give you an idea.
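
The checks mentioned above could look roughly like the sketch below. It assumes tesseract exits with a non-zero status when it fails, and that diagnostics (possibly including "Empty page!!" on some builds) go to standard error, so they are dropped by 2>/dev/null. The function name append_ocr_text is made up for this example, and /my/directory and result.txt are the same placeholders as in the loop above.

```shell
# Hypothetical helper (name chosen for this sketch): OCR one image and
# append its text to an output file.
# Returns 0 if text was appended, 1 if the image was skipped.
append_ocr_text() {
  local image="$1" outfile="$2" output

  # Capture stdout only; stderr diagnostics are discarded.
  # Assumption: tesseract exits non-zero on failure.
  if ! output="$(tesseract "$image" stdout 2>/dev/null)"; then
    echo "skipping $image: tesseract failed" >&2
    return 1
  fi

  # Keep the output only if it has at least one non-whitespace character.
  case $output in
    *[![:space:]]*) printf '%s\n' "$output" >> "$outfile" ;;
    *) return 1 ;;
  esac
}

for file in /my/directory/*.png
do
  # If the glob matched nothing, the pattern itself is left in $file.
  [ -e "$file" ] || continue
  append_ocr_text "$file" result.txt
done
```

Checking for non-whitespace characters rather than an empty string also skips pages where the only output is blank lines or a form feed.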

Source: https://stackoverflow.com/questions/62462879/how-do-i-check-for-output-of-tesseract-in-bash-script

Tags: Linux, bash, Ubuntu, tesseract
