indices of sublist, except one

I am trying to calculate Q-learning values using the following update equation:

$$Q_{k + 1}(s, a) \leftarrow \sum_{s'} P(s, a, s')[ R(s, a, s') + \gamma \max_{a'} Q_k(s',