Key lookup and intersection slower with less keys and weird performance

被刻印的时光 ゝ submitted on 2019-12-11 15:13:08

Question


I've built some code that matches a dictionary of dictionaries keyed by hash against another dictionary of arrays of hashes. All compared data is hashed and the matching results are correct, but some execution times are unexpectedly high. I can't explain this performance problem and I need to find an explanation.

The measured execution times (in seconds) look like this:

"field_1": {
        "communes_fr": 0.02382307815551758, <-- ** Here is the problem ** 
        "departements": 0.0023715062141418455,
        "diplome": 0.005338183879852295,
        "noms_fr": 0.24101004314422608, <-- **Here is the problem too**
        "prenoms_fr": 0.022109932899475097,
        "regions": 0.0029974632263183596,
        "university": 0.0022575488090515136
    },

'field_1' has 5000 entries; the tables have a fixed size, e.g. noms_fr has 400k+ records.

"field_2": {
        "communes_fr": 0.008192524909973145, <- Normal execution time
        "departements": 0.0004412479400634766,
        "diplome": 0.0013647904396057128,
        "noms_fr": 0.10224210166931152, <- Normal execution time
        "prenoms_fr": 0.02780187177658081,
        "regions": 0.0007285723686218261,
        "university": 0.0007123699188232422

'field_2' also has 5000 entries. I see the same performance problem whether I use the intersection algorithm or a key lookup.

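    // Time one intersection between a SQL column and a Mongo field.
    var start_time = performance.now();
    var count = module.exports.intersectionn(sql_dict_array[table][column], hash_mongo_dict[field]).length;

    // taux: match rate as a percentage of mongo_distinct_count[field].
    taux = (count / mongo_distinct_count[field]) * 100;
    // performance.now() returns milliseconds; store the elapsed time in seconds.
    performance_by_mongo_field[field][table] = (performance.now() - start_time) / 1000;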

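    intersectionn: function (A, B) {
        // Index every value of A as a key of a plain object for fast lookup.
        var m = A.reduce(function (m, v) { m[v] = 1; return m; }, {});
        // Keep the values of B that also appear in A (duplicates in B are kept).
        return B.filter(function (v) { return m[v]; });
    },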
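
For comparison, here is a minimal sketch of the same intersection built on a Set instead of a plain object used as a dictionary. The name intersectionWithSet and the sample arrays are hypothetical and not part of the code above. For string hashes the result should be the same as with intersectionn; whether it is faster on this data is something only a measurement can tell.

```javascript
// Hypothetical alternative to intersectionn: same contract, but the lookup
// structure is a Set rather than a plain object.
function intersectionWithSet(A, B) {
    // One entry per distinct value of A.
    const seen = new Set(A);
    // Keep the values of B that also appear in A (duplicates in B are kept).
    return B.filter(function (v) { return seen.has(v); });
}

// Made-up sample data standing in for the hashed values.
const sampleA = ["h1", "h2", "h3", "h3"];
const sampleB = ["h3", "h4", "h1", "h3"];
console.log(intersectionWithSet(sampleA, sampleB)); // [ 'h3', 'h1', 'h3' ]
```

One difference to keep in mind: plain object keys are coerced to strings, while a Set compares values without coercion, so the two versions only agree when both arrays hold values of the same type.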

In addition, when I reduce the size of the data in array B, the execution time gets worse.
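
A single timed call may include JIT warm-up or a garbage-collection pause, so one way to check whether the difference is stable is to repeat the measurement and look at the median. The following is only a sketch, assuming Node.js; medianSeconds, intersect and the generated sample data are hypothetical and stand in for the real tables and fields.

```javascript
const { performance } = require("perf_hooks");

// Hypothetical helper: runs fn several untimed warm-up rounds, then times
// `runs` calls and returns the median duration in seconds.
function medianSeconds(fn, runs = 11, warmup = 3) {
    for (let i = 0; i < warmup; i++) fn();
    const samples = [];
    for (let i = 0; i < runs; i++) {
        const start = performance.now();
        fn();
        samples.push((performance.now() - start) / 1000);
    }
    samples.sort(function (x, y) { return x - y; });
    return samples[Math.floor(samples.length / 2)];
}

// Same logic as the intersection in the question, copied here so the sketch runs alone.
function intersect(A, B) {
    const m = A.reduce(function (acc, v) { acc[v] = 1; return acc; }, {});
    return B.filter(function (v) { return m[v]; });
}

// Made-up data: 400000 "table" values and 5000 "field" values.
const tableValues = Array.from({ length: 400000 }, function (_, i) { return "hash_" + i; });
const fieldValues = Array.from({ length: 5000 }, function (_, i) { return "hash_" + (i * 7); });

console.log(medianSeconds(function () { return intersect(tableValues, fieldValues); }));
```

If the median for field_1 stays well above the one for field_2 across repeated runs, the gap is less likely to be measurement noise.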

I can't see any difference between field_1 and field_2 except that field_1 has more duplicates. The problem occurs only on this particular field, and the content of the data should not matter because I compare hashed values.

I would highly appreciate any feedback on this behavior. The code works properly, so I think it has more to do with how optimisation works in JS, and I need to be able to explain it.

Source: https://stackoverflow.com/questions/57256876/key-lookup-and-intersection-slower-with-less-keys-and-weird-performance
