How do I make my implementation of greedy set cover faster?

前端 未结 2 471
清酒与你
清酒与你 2020-12-10 21:41

I came up with the following implementation for the Greedy Set Cover after much discussion regarding my original question here. From the help I received, I encoded the probl

2条回答
  •  孤街浪徒
    2020-12-10 22:32

    What sorts of times are you getting vs what you need? Surely most of the execution time is spent in c-level code finding set intersections, so there's not much optimization you can do? With some random data (results may vary with your data of course, not sure if these are good values) of 100000 sets, 40 elements in each set, 500 unique elements, weights random from 1 to 10,

    print 'generating test data'    
    num_sets = 100000
    set_size = 40
    elements = range(500)
    U = set(elements)
    R = U
    S = []
    for i in range(num_sets):
        random.shuffle(elements)
        S.append(set(elements[:set_size]))
    w = [random.randint(1,100) for i in xrange(100)]
    
    C = []
    costs = []
    

    I got performance like this with cProfile:

             8200209 function calls in 14.391 CPU seconds
    
       Ordered by: standard name
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   14.391   14.391 :1()
           41    4.802    0.117   14.389    0.351 test.py:23(findMin)
            1    0.001    0.001   14.391   14.391 test.py:40(func)
      4100042    0.428    0.000    0.428    0.000 {len}
           82    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
           41    0.001    0.000    0.001    0.000 {method 'difference' of 'set' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      4100000    9.160    0.000    9.160    0.000 {method 'intersection' of 'set' objects}
    

    Hm, so mostly apparently 1/3 of the time isn't in set intersections. But I personally wouldn't optimize any more, especially at the cost of clarity. There's not going to be much you can do with the other 2/3, so why bother?

提交回复
热议问题