python setting limit for running time with while loop

Posted by 六月ゝ 毕业季﹏ on 2019-12-14 00:40:59

Question


I have a question about setting a maximum running time in Python. I am using pdfminer to convert PDF files to .txt. The problem is that some files often cannot be decoded and take an extremely long time. So I want to use time.time() to limit the conversion time for each file to 20 seconds. In addition, I am running on Windows, so I cannot use the signal function.

I succeeded in running the conversion code with pdfminer.convert_pdf_to_txt() (in my code it is "c"), but I could not integrate time.time() into the while loop. It seems to me that in the following code the while loop and time.time() do not work.

In summary, I want to:

  1. convert the pdf to txt

  2. limit each conversion to 20 seconds; if it runs out of time, throw an exception and save an empty file

  3. save all the txt files under the same folder

  4. if there are any exceptions/errors, still save the file but with empty content.

Here is the current code:

import converter as c
import os
import timeit
import time

yourpath = 'D:/hh/'



for root, dirs, files in os.walk(yourpath, topdown=False):


 for name in files:

           t_end = time.time() +20

           try: 

             while time.time() < t_end:

               c.convert_pdf_to_txt(os.path.join(root, name))


               t=os.path.split(os.path.dirname(os.path.join(root, name)))[1]
               a=str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])

               g=str(a.split("\\")[1])
               with open("D:/f/"+g+"&"+t+"&"+name+".txt", mode="w") as newfile:
                newfile.write(c.convert_pdf_to_txt(os.path.join(root, name)))
                print "yes"


             if time.time() > t_end:

                print "no"

                with open("D:/f/"+g+"&"+t+"&"+name+".txt", mode="w") as newfile:
                  newfile.write("")

           except KeyboardInterrupt:
              raise

           except:
              for name in files:
                t=os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a=str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])

                g=str(a.split("\\")[1])
                with open("D:/f/"+g+"&"+t+"&"+name+".txt", mode="w") as newfile:
                  newfile.write("")

Answer 1:


You have the wrong approach.

What you are doing is defining the end time and then immediately entering the while loop, because the current timestamp is lower than the end timestamp (at that point the condition is always True). The condition is only checked again once the loop body has finished, so the while loop is entered and you get stuck in the conversion function.
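The point above can be reproduced without pdfminer. The sketch below is only an illustration: `slow_call` and `run_with_loop_deadline` are hypothetical names, `time.sleep` stands in for the conversion, and the `break` is added so that the demonstration runs the call only once.

```python
import time

def slow_call(seconds):
    """Stand-in for a long conversion; it simply blocks for `seconds`."""
    time.sleep(seconds)

def run_with_loop_deadline(limit, call_duration):
    """Mimic the pattern from the question and return the elapsed time."""
    start = time.time()
    t_end = start + limit
    while time.time() < t_end:    # evaluated only before each iteration
        slow_call(call_duration)  # nothing interrupts this call
        break                     # one pass is enough for the demonstration
    return time.time() - start

elapsed = run_with_loop_deadline(limit=0.1, call_duration=0.3)
print("limit 0.1 s, elapsed %.2f s" % elapsed)
```

The call runs to completion even though the limit passed long before, which is the behaviour seen with the PDF conversion.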

I would suggest the signal module, which is already included in Python. It allows you to quit a function after n seconds. A basic example can be seen in this StackOverflow answer.
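For reference, a signal-based timeout might look roughly like the sketch below. It relies on signal.alarm and SIGALRM, which exist only on Unix, so it will not run on Windows, and it must be called from the main thread. `ConversionTimeout` and `call_with_alarm` are hypothetical names, and `time.sleep` again stands in for the conversion.

```python
import signal
import time

class ConversionTimeout(Exception):
    """Hypothetical exception raised when the alarm goes off."""

def _on_alarm(signum, frame):
    raise ConversionTimeout()

def call_with_alarm(func, seconds):
    """Run func(); raise ConversionTimeout after `seconds` whole seconds (Unix only)."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)                             # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)  # restore the old handler

try:
    call_with_alarm(lambda: time.sleep(3), 1)
    outcome = "finished"
except ConversionTimeout:
    outcome = "timed out"
print(outcome)
```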

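Since signal.alarm is not available on Windows, the code below uses threading.Timer together with thread.interrupt_main instead. Your code would look like this: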

import converter as c
import os
import threading
try:
    import thread             # Python 2
except ImportError:
    import _thread as thread  # Python 3

yourpath = 'D:/hh/'

for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        path = os.path.join(root, name)

        # Build the output name up front so every branch below can use it.
        t = os.path.split(os.path.dirname(path))[1]
        a = str(os.path.split(os.path.dirname(path))[0])
        g = str(a.split("\\")[1])
        outname = "D:/f/" + g + "&" + t + "&" + name + ".txt"

        text = ""  # stays empty on timeout or on any conversion error

        # After 20 seconds the timer raises KeyboardInterrupt in the main thread.
        timer = threading.Timer(20.0, thread.interrupt_main)
        timer.start()
        try:
            text = c.convert_pdf_to_txt(path)
            print("yes")
        except KeyboardInterrupt:
            # Raised by the timer; a real Ctrl+C during conversion ends up here too.
            print("no")
        except Exception:
            # Any other conversion error: keep the empty content.
            pass
        finally:
            timer.cancel()

        # Convert once and write the result (or an empty file) exactly once.
        with open(outname, mode="w") as newfile:
            newfile.write(text)
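The timer pattern used above can also be tried on its own, without pdfminer. The following is a minimal sketch under a few assumptions: `run_with_timeout` and `slow_pure_python` are hypothetical names, the slow function is a plain Python loop, and the code runs in the main thread. The interrupt is only delivered between Python bytecode instructions, so a call that blocks inside C code for a long time may not be stopped promptly.

```python
import threading
import time
try:
    import thread             # Python 2
except ImportError:
    import _thread as thread  # Python 3

def run_with_timeout(func, seconds):
    """Return (finished, result); finished is False if the timer fired first."""
    timer = threading.Timer(seconds, thread.interrupt_main)
    timer.start()
    try:
        return True, func()
    except KeyboardInterrupt:
        return False, None
    finally:
        timer.cancel()

def slow_pure_python():
    """Hypothetical stand-in for a conversion that takes about 2 seconds."""
    deadline = time.time() + 2.0
    while time.time() < deadline:
        time.sleep(0.01)
    return "done"

print(run_with_timeout(lambda: "quick", 1.0))
print(run_with_timeout(slow_pure_python, 0.2))
```

Wrapping the timer in a helper like this keeps the start and cancel calls in one place, so a forgotten timer.start() or timer.cancel() is less likely.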

I really hope this helps. If you have issues in understanding the code changes, feel free to ask.


Just for the future: use 4 spaces for indentation and not too much whitespace ;)



Source: https://stackoverflow.com/questions/40744250/python-setting-limit-for-running-time-with-while-loop
