A Faster Nested Tuple to List and Back

三世轮回 提交于 2019-12-05 23:52:42

I have been working with coroutines quiet a lot lately. The advantage would be that you reduce the overhead of the method calls. Sending a new value into a coroutine is faster than calling a function. While you can not make a recursive coroutine, it will throw a ValueError: generator already executing but you could make a pool of coroutine workers - you need one worker for every level of the tree. I have made some test code that works, but have not looked at the timing issues yet.

def coroutine(func):
    """ A helper function decorator from Beazley"""
    def start(*args, **kwargs):
        g = func(*args, **kwargs)
        g.next()
        return g
    return start

@coroutine
def cotuple2list():
    """This does the work"""
    result = None
    while True:
        (tup, co_pool) = (yield result)
        result = list(tup)
        # I don't like using append. So I am changing the data in place.
        for (i,x) in enumerate(result):
            # consider using "if hasattr(x,'__iter__')"
            if isinstance(x,tuple):
                result[i] = co_pool[0].send((x, co_pool[1:]))


@coroutine
def colist2tuple():
    """This does the work"""
    result = None
    while True:
        (lst, co_pool) = (yield result)
        # I don't like using append so I am changing the data in place...
        for (i,x) in enumerate(lst):
            # consider using "if hasattr(x,'__iter__')"
            if isinstance(x,list):
                lst[i] = co_pool[0].send((x, co_pool[1:]))
        result = tuple(lst)

Pure python alternative from HYRY's post:

def list2tuple(a):
    return tuple((list2tuple(x) if isinstance(x, list) else x for x in a))
def tuple2list(a):
    return list((tuple2list(x) if isinstance(x, tuple) else x for x in a))

Make a pool of coroutines - this is a hack of a pool, but it works:

# Make Coroutine Pools
colist2tuple_pool = [colist2tuple() for i in xrange(20) ]
cotuple2list_pool = [cotuple2list() for i in xrange(20) ]

Now do some timing - comparing to :

def make_test(m, n):
    # Test data function taken from HYRY's post!
    return [[range(m), make_test(m, n-1)] for i in range(n)]
import timeit
t = make_test(20, 8)
%timeit list2tuple(t)
%timeit colist2tuple_pool[0].send((t, colist2tuple_pool[1:]))

Results - notice the 'u' next to the 's' in the second line :-)

1 loops, best of 3: 1.32 s per loop
1 loops, best of 3: 4.05 us per loop

Really seems too fast to believe. Anybody know if timeit works with coroutines? Here is the old fashioned way:

tic = time.time()
t1 = colist2tuple_pool[0].send((t, colist2tuple_pool[1:]))
toc = time.time()
print toc - tic

result:

0.000446081161499

Newer versions of Ipython and %timit give a warning:

The slowest run took 9.04 times longer than the fastest. This could
mean that an intermediate result is being cached 1000000 loops, best of 3: 317 ns per loop

After some further investigation, python generators are not magic and send is still a function call. The reason my generator based method appeared to be faster is that I was doing an inplace operation on the lists - which resulted in fewer function calls.

I wrote all this out with lots of additional detail in a recent talk.

Hope this helps someone looking to play with generators.

Define two function separately:

def list2tuple(a):
    return tuple((list2tuple(x) if isinstance(x, list) else x for x in a))

def tuple2list(a):
    return list((tuple2list(x) if isinstance(x, tuple) else x for x in a))

some test:

t = [1, 2, [3, 4], [5, [7, 8]], 9]
t2 = list2tuple(t)
t3 = tuple2list(t2)
print t2
print t3

The results:

(1, 2, (3, 4), (5, (7, 8)), 9)
[1, 2, [3, 4], [5, [7, 8]], 9]

Edit: for fast version:

def list2tuple2(a, tuple=tuple, type=type, list=list):
    return tuple([list2tuple2(x) if type(x)==list else x for x in a])

def tuple2list2(a, tuple=tuple, type=type):
    return [tuple2list2(x) if type(x)==tuple else x for x in a]

To compare I also include cython version:

%%cython

def list2tuple3(a):
    return tuple([list2tuple3(x) if type(x)==list else x for x in a])

def tuple2list3(a):
    return [tuple2list3(x) if type(x)==tuple else x for x in a]

Create some nested list:

def make_test(m, n):
    return [[range(m), make_test(m, n-1)] for i in range(n)]

t = make_test(20, 8)
t2 = list2tuple2(t)

Then compare the speed:

%timeit listify(t, list, tuple)
%timeit listify(t2, tuple, list)
%timeit list2tuple(t)
%timeit tuple2list(t2)
%timeit list2tuple2(t)
%timeit tuple2list2(t2)
%timeit list2tuple3(t)
%timeit tuple2list3(t2)

The result is:

listify
1 loops, best of 3: 828 ms per loop
1 loops, best of 3: 912 ms per loop

list2tuple generator expression version
1 loops, best of 3: 1.49 s per loop
1 loops, best of 3: 1.67 s per loop

list2tuple2 list comprehension with local cache
1 loops, best of 3: 623 ms per loop
1 loops, best of 3: 566 ms per loop

list2tuple3 cython
1 loops, best of 3: 212 ms per loop
10 loops, best of 3: 232 ms per loop

Since the answers above do not deal with tuples or lists inside dictionary values, I post my own code:

def tuple2list(data):
    if isinstance(data, dict):
        return {
            key: tuple2list(value)
            for key, value in data.items()
        }
    elif isinstance(data, (list, tuple)):
        return [
            tuple2list(item)
            for item in data
        ]
    return data

def list2tuple(data):
    if isinstance(data, dict):
        return {
            key: list2tuple(value)
            for key, value in data.items()
        }
    elif isinstance(data, (list, tuple)):
        return tuple(
            list2tuple(item)
            for item in data
        )
    return data
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!