Can this loop be sped up in pure Python?

Posted by 倖福魔咒の on 2019-12-11 06:49:46

Question


I was experimenting with Python to find out how many times it can add one to an integer in one minute. Assuming two computers are identical except for CPU speed, the result should give a rough estimate of how fast the CPU in question performs this kind of operation.

The code below is one such test. This version is about 20% faster than the first attempt and 150% faster than the third attempt. Can anyone suggest how to get the most additions in a one-minute time span? Higher numbers are desirable.

EDIT 1: This experiment is being written in Python 3.1 and is 15% faster than the fourth speed-up attempt.

def start(seconds):
    import time, _thread
    def stop(seconds, signal):
        time.sleep(seconds)
        signal.pop()
    total, signal = 0, [None]
    _thread.start_new_thread(stop, (seconds, signal))
    while signal:
        total += 1
    return total

if __name__ == '__main__':
    print('Testing the CPU speed ...')
    print('Relative speed:', start(60))

EDIT 2: Regarding using True instead of 1 in the while loop: there should be no speed difference. The following experiment proves that they are the same. First, create a file named main.py and copy the following code into it.

def test1():
    total = 0
    while 1:
        total += 1

def test2():
    total = 0
    while True:
        total += 1

if __name__ == '__main__':
    import dis, main
    dis.dis(main)

Running the code should produce the following output that shows how the code was actually compiled and what the generated Python Virtual Machine Instructions turned out to be.

Disassembly of test1:
  2           0 LOAD_CONST               1 (0) 
              3 STORE_FAST               0 (total) 

  3           6 SETUP_LOOP              13 (to 22) 

  4     >>    9 LOAD_FAST                0 (total) 
             12 LOAD_CONST               2 (1) 
             15 INPLACE_ADD          
             16 STORE_FAST               0 (total) 
             19 JUMP_ABSOLUTE            9 
        >>   22 LOAD_CONST               0 (None) 
             25 RETURN_VALUE         

Disassembly of test2:
  7           0 LOAD_CONST               1 (0) 
              3 STORE_FAST               0 (total) 

  8           6 SETUP_LOOP              13 (to 22) 

  9     >>    9 LOAD_FAST                0 (total) 
             12 LOAD_CONST               2 (1) 
             15 INPLACE_ADD          
             16 STORE_FAST               0 (total) 
             19 JUMP_ABSOLUTE            9 
        >>   22 LOAD_CONST               0 (None) 
             25 RETURN_VALUE         

The emitted PVMIs (byte codes) are exactly the same, so both loops should run without any difference in speed.
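The same comparison can be made programmatically rather than by eye. The following is a minimal sketch, assuming CPython 3.4 or later (where `dis.get_instructions` is available); the helper name `instruction_summary` is made up for this illustration. It compares opcode names and resolved operands, so it does not depend on where a constant happens to sit in the constants table.

```python
import dis

def test1():
    total = 0
    while 1:
        total += 1

def test2():
    total = 0
    while True:
        total += 1

def instruction_summary(func):
    """Return (opname, argval) pairs for every instruction in func.

    argval is the resolved operand (a constant value, a variable name,
    or a jump target), not an index into a table.
    """
    return [(ins.opname, ins.argval) for ins in dis.get_instructions(func)]

if __name__ == '__main__':
    # Neither function is called here; only the compiled code is inspected.
    same = instruction_summary(test1) == instruction_summary(test2)
    print('Identical instructions:', same)
```

Note that this only holds for Python 3, where True is a keyword; the functions are inspected but never run, since both loop forever.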


Answer 1:


About a 20-25% improvement, FWIW - but like others, I'd propose that Python incrementing integers probably isn't the best benchmarking tool.

def start(seconds):
    import time, _thread
    def stop(seconds):
        time.sleep(seconds)
        _thread.interrupt_main()
    total = 0
    _thread.start_new_thread(stop, (seconds,))
    try:
        while True:
            total += 1
    except KeyboardInterrupt:  # raised in the main thread by interrupt_main()
        return total

if __name__ == '__main__':
    print('Testing the CPU speed ...')
    print('Relative speed:', start(60))
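As a hedged aside on the benchmarking remark: instead of counting how much work fits in a fixed time, the standard `timeit` module can time a fixed amount of work and derive a rate from it. This is only a sketch of that alternative, not a replacement for the code above; the function name `additions_per_second` and its default parameters are assumptions chosen for illustration.

```python
import timeit

def additions_per_second(count=1000000, repeats=5):
    """Estimate integer increments per second from a fixed amount of work."""
    # timeit compiles the statement into its own timing loop, so the
    # measured time includes that loop's overhead as well as the addition.
    timings = timeit.repeat("total += 1", setup="total = 0",
                            number=count, repeat=repeats)
    # The fastest run is the one least disturbed by other processes.
    return count / min(timings)

if __name__ == '__main__':
    print('Additions per second: %.0f' % additions_per_second())
```

The number it prints is not directly comparable with the "relative speed" figures in this thread, because the loop structure being timed is different.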



Answer 2:


I see almost the same but consistently better (~2%) results than @Amber's version on my machine with Python 3.1.2, using this code:

import signal

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def jfs_signal(seconds):
    # set signal handler (SIGALRM is only available on Unix)
    signal.signal(signal.SIGALRM, alarm_handler)
    # raise Alarm in `seconds` seconds
    signal.alarm(seconds)

    total = 0
    try:
        while 1:
            total += 1
    except Alarm:
        pass # time is up
    finally:
        signal.alarm(0) # disable the alarm
    return total

Here's a variant that uses the subprocess module to run an interruptible loop:

#!/usr/bin/env python
# save it as `skytower.py` file
import atexit
import os
import signal
import subprocess
import sys
import tempfile
import time

def loop():
    @atexit.register
    def print_total():
        print(total)

    total = 0
    while 1:
        total += 1

def jfs_subprocess(seconds):
    # start process, redirect stdout/stderr
    f = tempfile.TemporaryFile() 
    p = subprocess.Popen([sys.executable, "-c",
                          "from skytower import loop; loop()"],
                         stdout=f, stderr=open(os.devnull, 'wb'))
    # wait 
    time.sleep(seconds)

    # raise KeyboardInterrupt
    #NOTE: if it doesn't kill the process then `p.wait()` blocks forever
    p.send_signal(signal.SIGINT) 
    p.wait() # wait for the process to terminate otherwise the output
             # might be garbled

    # return saved output
    f.seek(0) # rewind to the beginning of the file
    d = int(f.read())
    f.close()
    return d

if __name__ == '__main__':
    print('total:', jfs_subprocess(60))

It is ~20% slower than the signal.alarm() variant on my machine.
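One possible contributor to that slowdown (this is an assumption, not something measured in the answer) is that `total` in `loop()` is also read by the nested `print_total()` function. CPython then stores it in a closure cell and accesses it with the `*_DEREF` opcodes instead of the fast-local ones. The sketch below only demonstrates that compilation difference; `plain_loop`, `closure_loop` and `opnames` are hypothetical names, and the loops are bounded so they can actually be run.

```python
import dis

def plain_loop(n):
    total = 0              # ordinary fast local
    for _ in range(n):
        total += 1
    return total

def closure_loop(n):
    def report():          # reading `total` here turns it into a cell variable
        return total
    total = 0
    for _ in range(n):
        total += 1
    return report()

def opnames(func):
    """Return the set of opcode names used by func."""
    return {ins.opname for ins in dis.get_instructions(func)}

if __name__ == '__main__':
    print('plain cell variables:  ', plain_loop.__code__.co_cellvars)
    print('closure cell variables:', closure_loop.__code__.co_cellvars)
    print('closure uses DEREF opcodes:',
          any('DEREF' in name for name in opnames(closure_loop)))
```

Whether this accounts for the whole 20% would have to be measured; starting a separate process and importing the module are not part of the timed loop, so they should matter less.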




Answer 3:


This exercise in learning more about Python and computers was satisfying. This is the final program:

def start(seconds, total=0):
    import _thread, time
    def stop():
        time.sleep(seconds)
        _thread.interrupt_main()
    _thread.start_new_thread(stop, ())
    try:
        while True:
            total += 1
    except KeyboardInterrupt:
        return total

if __name__ == '__main__':
    print('Testing the CPU speed ...')
    print('Relative speed:', start(60))

Running it on Windows 7 Professional with a 2.16 GHz CPU produced the following output within IDLE:

Python 3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)] 
on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
Testing the CPU speed ...
Relative speed: 673991388
>>> 
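To put the reported number in perspective, it can be converted into a per-second rate and a rough cycle count per loop iteration. The arithmetic below is a sketch that assumes the CPU ran at its nominal 2.16 GHz for the whole minute (no frequency scaling) and that the loop had a core to itself.

```python
total = 673991388      # count reported in the IDLE session above
seconds = 60           # duration passed to start()
clock_hz = 2.16e9      # nominal clock speed; assumed constant during the run

per_second = total / seconds
cycles_per_iteration = clock_hz / per_second

print('Additions per second: %.0f' % per_second)
print('Approx. clock cycles per loop iteration: %.0f' % cycles_per_iteration)
```

That works out to roughly 11.2 million additions per second, or on the order of 190 clock cycles per iteration, which covers the whole interpreted loop body rather than a single machine-level add.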

Edit: The code above runs on only one core. The following program was written to fix that problem.

#! /usr/bin/env python3

def main(seconds):
    from multiprocessing import cpu_count, Barrier, SimpleQueue, Process
    def get_all(queue):
        while not queue.empty():
            yield queue.get()
    args = seconds, Barrier(cpu_count()), SimpleQueue()
    processes = [Process(target=run, args=args) for _ in range(cpu_count())]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print('Relative speed:', sorted(get_all(args[-1]), reverse=True))

def run(seconds, barrier, queue):
    from time import sleep
    from _thread import interrupt_main, start_new_thread
    def terminate():
        sleep(seconds)
        interrupt_main()
    total = 0
    barrier.wait()
    start_new_thread(terminate, ())
    try:
        while True:
            total += 1
    except KeyboardInterrupt:
        queue.put(total)

if __name__ == '__main__':
    main(60)


Source: https://stackoverflow.com/questions/4676544/can-this-loop-be-sped-up-in-pure-python
