如何在Python中使用线程?

强颜欢笑 提交于 2019-12-10 19:03:13

【推荐】2019 Java 开发者跳槽指南.pdf(吐血整理) >>>

我试图了解Python中的线程。 我看过文档和示例,但是坦率地说,许多示例过于复杂,我难以理解它们。

您如何清楚地显示为多线程而划分的任务?


#1楼

对我而言,线程化的完美示例是监视异步事件。 看这段代码。

# thread_test.py
import threading
import time 

class Monitor(threading.Thread):
    def __init__(self, mon):
        threading.Thread.__init__(self)
        self.mon = mon

    def run(self):
        while True:
            if self.mon[0] == 2:
                print "Mon = 2"
                self.mon[0] = 3;

您可以通过打开IPython会话并执行以下操作来使用此代码:

>>>from thread_test import Monitor
>>>a = [0]
>>>mon = Monitor(a)
>>>mon.start()
>>>a[0] = 2
Mon = 2
>>>a[0] = 2
Mon = 2

等一下

>>>a[0] = 2
Mon = 2

#2楼

仅需注意,线程不需要Queue。

这是我能想到的最简单的示例,其中显示了10个进程同时运行。

import threading
from random import randint
from time import sleep


def print_number(number):
    # Sleeps a random 1 to 10 seconds
    rand_int_var = randint(1, 10)
    sleep(rand_int_var)
    print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"

thread_list = []

for i in range(1, 10):
    # Instantiates the thread
    # (i) does not make a sequence, so (i,)
    t = threading.Thread(target=print_number, args=(i,))
    # Sticks the thread in a list so that it remains accessible
    thread_list.append(t)

# Starts threads
for thread in thread_list:
    thread.start()

# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
    thread.join()

# Demonstrates that the main process waited for threads to complete
print "Done"

#3楼

Alex Martelli的回答对我有所帮助,但是这里是我认为更有用的修改版本(至少对我而言)。

更新:在python2和python3中均可使用

try:
    # for python3
    import queue
    from urllib.request import urlopen
except:
    # for python2 
    import Queue as queue
    from urllib2 import urlopen

import threading

worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']

#load up a queue with your data, this will handle locking
q = queue.Queue()
for url in worker_data:
    q.put(url)

#define a worker function
def worker(url_queue):
    queue_full = True
    while queue_full:
        try:
            #get your data off the queue, and do some work
            url = url_queue.get(False)
            data = urlopen(url).read()
            print(len(data))

        except queue.Empty:
            queue_full = False

#create as many threads as you want
thread_count = 5
for i in range(thread_count):
    t = threading.Thread(target=worker, args = (q,))
    t.start()

#4楼

我发现这非常有用:创建与内核一样多的线程,并让它们执行(大量)任务(在这种情况下,调用Shell程序):

import Queue
import threading
import multiprocessing
import subprocess

q = Queue.Queue()
for i in range(30): #put 30 tasks in the queue
    q.put(i)

def worker():
    while True:
        item = q.get()
        #execute a task: call a shell program and wait until it completes
        subprocess.call("echo "+str(item), shell=True) 
        q.task_done()

cpus=multiprocessing.cpu_count() #detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
     t = threading.Thread(target=worker)
     t.daemon = True
     t.start()

q.join() #block until all tasks are done

#5楼

自从2010年提出这个问题以来,如何使用带有mappool的 python进行简单的多线程处理有了真正的简化。

下面的代码来自一篇文章/博客文章,您绝对应该检出(没有从属关系)- 并行显示在一行中:更好的日常线程任务模型 。 我将在下面进行总结-最终仅是几行代码:

from multiprocessing.dummy import Pool as ThreadPool 
pool = ThreadPool(4) 
results = pool.map(my_function, my_array)

这是以下内容的多线程版本:

results = []
for item in my_array:
    results.append(my_function(item))

描述

Map是一个很棒的小功能,是轻松将并行性注入Python代码的关键。 对于那些不熟悉的人,地图是从Lisp之类的功能语言中提炼出来的。 它是将另一个功能映射到序列上的功能。

Map为我们处理序列上的迭代,应用函数,并将所有结果存储在最后的方便列表中。


实作

map函数的并行版本由以下两个库提供:multiprocessing,以及鲜为人知但同样出色的step child:multiprocessing.dummy。

multiprocessing.dummy与多处理模块完全相同, 但是使用线程代替一个重要的区别 -使用多个进程来执行CPU密集型任务;使用IO(和IO期间) ):

multiprocessing.dummy复制了多处理的API,但仅不过是线程模块的包装器。

import urllib2 
from multiprocessing.dummy import Pool as ThreadPool 

urls = [
  'http://www.python.org', 
  'http://www.python.org/about/',
  'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
  'http://www.python.org/doc/',
  'http://www.python.org/download/',
  'http://www.python.org/getit/',
  'http://www.python.org/community/',
  'https://wiki.python.org/moin/',
]

# make the Pool of workers
pool = ThreadPool(4) 

# open the urls in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)

# close the pool and wait for the work to finish 
pool.close() 
pool.join() 

以及计时结果:

Single thread:   14.4 seconds
       4 Pool:   3.1 seconds
       8 Pool:   1.4 seconds
      13 Pool:   1.3 seconds

传递多个参数仅在Python 3.3和更高版本中才这样 ):

要传递多个数组:

results = pool.starmap(function, zip(list_a, list_b))

或传递一个常数和一个数组:

results = pool.starmap(function, zip(itertools.repeat(constant), list_a))

如果您使用的是Python的早期版本,则可以通过此替代方法传递多个参数。

(感谢user136036的有用评论)

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!