Question
I discovered Cython yesterday, and adding a couple of cdef declarations for my NumPy structures cut execution time by 50%, turning my 24-hour run into 12 hours. Incredible!
I have a Python dictionary with integer keys; each value is a dictionary with 2 fixed keys, each holding a float. This structure is accessed a huge number of times, and I am hoping Cython can speed this up as well.
The Cython documentation talks about the C++ map and refers to map.pxd. I have zero experience in C++, but it appears Cython should be able to speed up my code; I just need a little help.
At the start of my loop:
# This dictionary will hold at maximum 100*1000
# key is an int between 0 and 200000,
# index_0_start is defined above as "cdef int index_0_start"
# value is an object/struct containing 2 floats
t_minus_1 = {index_0_start: {"pb": 1.0, "pnb": 0.0}}
Many checks for whether a key exists:
# s is an integer, cdef int s
if s not in time_t:
    time_t[s] = {"pb": 0, "pnb": 0}
Many accesses for calculations:
# s is an integer, mv_ctc is a memory view, twk is a float
time_t[s]["pb"] = time_t[s]["pb"] + mv_ctc[t,blank_index] * twk * ( t_minus_1[s]["pb"] + t_minus_1[s]["pnb"] )
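To make the access pattern above concrete, here is a minimal pure-Python sketch of the existence check followed by the "pb" update. The helper name update_pb and all numeric values are hypothetical, and a nested list stands in for the memory view (so it is indexed as mv_ctc[t][blank_index] rather than mv_ctc[t, blank_index]).

```python
# Hypothetical stand-ins for values that the real code defines elsewhere.
blank_index = 0
t = 1
twk = 0.5
mv_ctc = [[0.0, 0.0], [0.25, 0.75]]  # nested list standing in for the memory view
index_0_start = 7

t_minus_1 = {index_0_start: {"pb": 1.0, "pnb": 0.0}}
time_t = {}


def update_pb(s):
    """Create the entry for s if it is missing, then accumulate into "pb"."""
    if s not in time_t:
        time_t[s] = {"pb": 0.0, "pnb": 0.0}
    prev = t_minus_1[s]  # look the inner dict up once instead of twice
    time_t[s]["pb"] += mv_ctc[t][blank_index] * twk * (prev["pb"] + prev["pnb"])


# Two updates for the same key, as happens repeatedly inside the real loop.
update_pb(index_0_start)
update_pb(index_0_start)
print(time_t)
```

Each call performs several hash lookups (the membership test, time_t[s], t_minus_1[s], and the string keys), which is the cost the question hopes a typed container could reduce.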
Currently I sort the above dictionaries and select the best k entries in Python code. This happens 100 times less often than the accesses above, but that is still a huge number of times. I am hoping C++/Cython can simplify this with a sortable map, plus deleting the entries below k.
# calculate the k most probable hypotheses
prob_hythotesis = []
for s in time_t:
    p = (time_t[s]["pb"] + time_t[s]["pnb"]) * length_in_graph[s]**beta
    prob_hythotesis.append((s, p))
prob_hythotesis.sort(key=lambda x: x[1], reverse=True)
# shift state to time t-1
t_minus_1 = {}
for sp in prob_hythotesis[0: min(k+1, len(prob_hythotesis))]:
    s = sp[0]
    t_minus_1[s] = time_t[s]
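For comparison, the sort-and-prune step above can be sketched in plain Python without a full sort, using heapq.nlargest from the standard library. This is only a sketch under assumptions: the function name prune_to_top_k and the sample values are hypothetical, and length_in_graph is assumed to be indexable by s. It keeps k + 1 entries, matching the slice in the code above.

```python
import heapq


def prune_to_top_k(time_t, length_in_graph, beta, k):
    """Return a new dict holding the k + 1 highest-scoring entries of time_t."""

    def score(s):
        entry = time_t[s]
        return (entry["pb"] + entry["pnb"]) * length_in_graph[s] ** beta

    # nlargest(n, iterable, key) is equivalent to
    # sorted(iterable, key=key, reverse=True)[:n], but avoids sorting everything.
    best = heapq.nlargest(k + 1, time_t, key=score)
    return {s: time_t[s] for s in best}


# Hypothetical sample data, chosen only to exercise the function.
time_t = {
    1: {"pb": 0.5, "pnb": 0.25},
    2: {"pb": 0.125, "pnb": 0.0},
    3: {"pb": 0.25, "pnb": 0.25},
}
length_in_graph = {1: 1, 2: 2, 3: 4}

t_minus_1 = prune_to_top_k(time_t, length_in_graph, beta=1.0, k=1)
print(list(t_minus_1))
```

The inner dictionaries are shared rather than copied, as in the original loop, so t_minus_1[s] and time_t[s] refer to the same object.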
The closest Stack Overflow answer is 5 years old and different enough that, with zero C++ experience, I am lost. Any help is appreciated.
Source: https://stackoverflow.com/questions/60472223/cython-map-with-int-k-with-value-a-struct