Conditionally replace values in one list using another list of different length and ranges, based on percentage overlap, in Python

予麋鹿 2021-01-14 17:47

One text file 'Truth' contains the following values:

0.000000    3.810000    Three
3.810000    3.910923    NNNN
3.910923    5.429000    AAAA
5.429000            


        
3 answers
  •  刺人心 (original poster)
    2021-01-14 18:23

    This is "just" number crunching - here is one way:

    raw_test = [[0.000000   , 3.810000  ,  'Three'],
            [3.810000   , 3.910923  ,  'Three'],
            [3.910923   , 5.429000  ,  'AAAA '],
            [5.429000   , 7.060000  ,  'Three'],
            [7.060000   , 8.411000  ,  'Three'],
            [8.411000   , 8.971000  ,  'Zero'],
            [8.971000   , 13.40600  ,  'Three'],
            [13.40600   , 13.82700  ,  'Zero'], 
            [13.82700   , 15.935554 ,  'Two'], 
            [15.935554  , 20.138337 ,  'Two'],]
    
    raw_truth = [[0.000000 ,   1.00000   ,  'MMMM'],
       [1.000    ,   3.810000  ,  'Three'],
       [3.810000 ,   3.910923  ,  'NNNN'],
       [3.910923 ,   5.429000  ,  'AAAA'],
       [5.429000 ,   6.0000    ,  'MMMM'],
       [6.0000   ,   7.060000  ,  'AAAA'],
       [7.060000 ,   8.411000  ,  'MMMM'],
       [8.411000 ,   8.971000  ,  'MMMM'],
       [8.971000 ,   11.00     ,  'abcd'],
       [11.00    ,   13.40600  ,  'MMMM'],
       [13.40600 ,   13.82700  ,  'Zero'],
       [13.82700 ,   15.935554 ,  'One'],]
    
    truth = {}
    for mi,ma,key in raw_truth:
      truth.setdefault((mi,ma), key)
    
    test = [ (mi,ma,ma - mi,lab) for mi,ma,lab in raw_test ]
    
    overlap = []
    overlap.append(["test-min","test-max","test-size","test-lab",
                    "#","truth-min","truth-max","truth-lab",
                    "#","min-over","max-over","over-size","%"])
    
    for mi,ma,siz,lab in test:
      for key in truth:
        truMi,truMa = key
        truVal = truth[key]
    
        if mi < truMa and ma > truMi: # coarse filter: the ranges intersect (also when test fully contains truth)
          minOv = max(truMi,mi)
          maxOv = min(truMa,ma)
          sizOv = maxOv-minOv
          perc = sizOv/(siz/100.0)
          if perc > 0: # fine filter
            overlap.append([mi,ma,siz,lab,
                            '#',truMi,truMa,truVal,
                            '#',minOv,maxOv, sizOv, perc ])
    
    # just some printing:    
    print(truth)
    print()    
    
    print(test)
    print()    
    
    for d in overlap:
      for x in d:
        if isinstance(x, str):
          if x == '#':
            print( '  |  ', end ="")
          else:
            print( '{:<10}'.format(x), end ="")
        else:
          print( '{:<10.5f}'.format(x), end ="")
      print(" %")
    
    # The print calls are Python 3; when this answer was written, the question
    # had no Python 2 tag. For Python 2, replace them with
    #    print '  |  ',
    #    print '{:<10}'.format(x),
    #    print '{:<10.5f}'.format(x),
    # etc. or adapt them accordingly - see https://stackoverflow.com/a/2456292/7505395
    

    Output:

    test-min  test-max  test-size test-lab    |  truth-min truth-max truth-lab   |  min-over  max-over  over-size %          %
    0.00000   3.81000   3.81000   Three       |  0.00000   1.00000   MMMM        |  0.00000   1.00000   1.00000   26.24672   %
    0.00000   3.81000   3.81000   Three       |  1.00000   3.81000   Three       |  1.00000   3.81000   2.81000   73.75328   %
    3.81000   3.91092   0.10092   Three       |  3.81000   3.91092   NNNN        |  3.81000   3.91092   0.10092   100.00000  %
    3.91092   5.42900   1.51808   AAAA        |  3.91092   5.42900   AAAA        |  3.91092   5.42900   1.51808   100.00000  %
    5.42900   7.06000   1.63100   Three       |  5.42900   6.00000   MMMM        |  5.42900   6.00000   0.57100   35.00920   %
    5.42900   7.06000   1.63100   Three       |  6.00000   7.06000   AAAA        |  6.00000   7.06000   1.06000   64.99080   %
    7.06000   8.41100   1.35100   Three       |  7.06000   8.41100   MMMM        |  7.06000   8.41100   1.35100   100.00000  %
    8.41100   8.97100   0.56000   Zero        |  8.41100   8.97100   MMMM        |  8.41100   8.97100   0.56000   100.00000  %
    8.97100   13.40600  4.43500   Three       |  8.97100   11.00000  abcd        |  8.97100   11.00000  2.02900   45.74972   %
    8.97100   13.40600  4.43500   Three       |  11.00000  13.40600  MMMM        |  11.00000  13.40600  2.40600   54.25028   %
    13.40600  13.82700  0.42100   Zero        |  13.40600  13.82700  Zero        |  13.40600  13.82700  0.42100   100.00000  %
    13.82700  15.93555  2.10855   Two         |  13.82700  15.93555  One         |  13.82700  15.93555  2.10855   100.00000  %
    

    Disclaimer: I haven't checked every number by hand, only glanced at the output, so verify it yourself. You still need to apply the truth-lab to each test row wherever the overlap percentage fits your criterion.
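
One possible way to do that last step, as a minimal sketch only: for each test row, pick the truth label that covers the largest share of the row, and replace the test label when that share reaches a cut-off. The names `overlap_percent` and `relabel`, and the 50 % default `threshold`, are hypothetical; the question as shown does not state the actual rule, so adjust the cut-off to whatever you need. The sample rows are a subset of the data above.

```python
def overlap_percent(test_iv, truth_iv):
    """Share of the test interval (in %) that the truth interval covers."""
    t_min, t_max = test_iv
    g_min, g_max = truth_iv
    size = t_max - t_min
    if size <= 0:
        return 0.0  # empty test interval: nothing to cover
    covered = min(t_max, g_max) - max(t_min, g_min)
    return max(covered, 0.0) / size * 100.0


def relabel(test_rows, truth_rows, threshold=50.0):
    """Replace each test label with the best-overlapping truth label.

    threshold is an assumed cut-off in percent: a row keeps its own
    label when no truth interval covers at least that share of it.
    On ties, the first truth interval encountered wins.
    """
    result = []
    for t_min, t_max, label in test_rows:
        best_pct, best_label = 0.0, label
        for g_min, g_max, g_label in truth_rows:
            pct = overlap_percent((t_min, t_max), (g_min, g_max))
            if pct > best_pct:
                best_pct, best_label = pct, g_label
        new_label = best_label if best_pct >= threshold else label
        result.append([t_min, t_max, new_label])
    return result


# A few rows taken from raw_test / raw_truth above
sample_test = [[0.0, 3.81, 'Three'],
               [3.81, 3.910923, 'Three'],
               [13.827, 15.935554, 'Two'],
               [15.935554, 20.138337, 'Two']]
sample_truth = [[0.0, 1.0, 'MMMM'],
                [1.0, 3.81, 'Three'],
                [3.81, 3.910923, 'NNNN'],
                [13.827, 15.935554, 'One']]

for row in relabel(sample_test, sample_truth):
    print(row)
```

With these sample rows, the second and third test labels become 'NNNN' and 'One', while the last row keeps 'Two' because no truth interval overlaps it.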
