Select x random elements from a weighted list in C# (without replacement)

前端 未结 4 833
旧巷少年郎
旧巷少年郎 2020-12-19 10:00

Update: my problem has been solved, I updated the code source in my question to match with Jason\'s answer. Note that rikitikitik answer is solving the issu

相关标签:
4条回答
  • 2020-12-19 10:15

    You could do this:

    Card GetCard(List<Card> cards)
    {
      int total = 0;
      foreach (Card c in cards)
      {
        total += AttributionRate;
      }
    
      int index = Random.Next(0, total - 1);
      foreach(Card c in cards)
      {
        index -= c.AttributionRate;
        if (index < 0)
        {
          return c;
        }
      }
    }
    
    Card PopCard(List<Card> cards)
    {
      Card c = GetCard(cards);
      cards.Remove(c);
    }
    

    In theory this should work.

    0 讨论(0)
  • 2020-12-19 10:16

    As some people have mentioned in the comments, create a list of the cards in the exact proportion that you want:

    var deck = new List<Card>();
    
    cards.ForEach(c => 
    {
        for(int i = 0; i < c.AttributionRate; i++)
        {
             deck.Add(c);
        }
    }
    

    Shuffle:

    deck = deck.OrderBy(c => Guid.NewGuid()).ToList();
    

    And pick x cards:

    var hand = deck.Take(x)
    

    Of course this only works if AttributionRate is an int. Otherwise, you would have to tinker with the deck generation a bit.

    I get the following results for 10,000 runs taking 5 at a time:

    Card 1: 9.932% 
    Card 2: 30.15% 
    Card 3: 49.854% 
    Card 4: 10.064% 
    

    Another result:

    Card 1: 10.024%
    Card 2: 30.034%
    Card 3: 50.034% 
    Card 4: 9.908% 
    

    EDIT:

    I've braved the bitwise operations and I have taken a look at your code. After adding a generous amount of barbecue sauce on my fried brain, I noticed a few things:

    First, Random.Next(min,max) will include min in the random pool, but not max. This is the reason for the higher than expected probability for Card 1.

    After doing that change, I implemented your code and it appears to be working when you draw 1 card.

    Card 1: 10.4%  
    Card 2: 32.2% 
    Card 3: 48.4% 
    Card 4: 9.0% 
    
    Card 1: 7.5%
    Card 2: 28.1%
    Card 3: 50.0% 
    Card 4: 14.4% 
    

    HOWEVER, your code will not work when you draw more than 1 card because of this statement:

    heap[i].Weight = 0;
    

    That line, and the recalculation loop after that, essentially removes all instances of the drawn card from the heap. If you happen to draw four cards, then the percentage becomes 25% for all cards since you're basically drawing all 4 cards. The algorithm, as it is, is not completely applicable to your case.

    I suspect you would have to recreate the heap every time you draw a card, but I doubt it would still perform as well. If I were to work on this though, I would just generate 4 distinct random numbers from 1 to heap[1].TotalWeight and get the 4 corresponding cards from there, although the random number generation in this case might become unpredictable (rerolling) and thus inefficient.

    0 讨论(0)
  • 2020-12-19 10:17

    There are two minor bugs in the program. First, the range of the random number should be exactly equal to the total weight of all the items:

    int gas = random.Next(heap[1].TotalWeight);
    

    Second, change both places where it says gas > to say gas >=.

    (The original Python code is OK because gas is a floating-point number, so the difference between > and >= is negligible. That code was written to accept either integer or floating-point weights.)

    Update: OK, you made the recommended changes in your code. I think that code is correct now!

    0 讨论(0)
  • 2020-12-19 10:30

    If you want to pick x elements from a weighted set without replacement such that elements are chosen with a probability proportional to their weights, then your algorithm is wrong.

    Consider the following weighted list:
    'a': weight 1
    'b': weight 2
    'c': weight 3
    and x = 2

    In this example, your function should always return 'c' in the result set. This is the only way for 'c' to be chosen 3x as often as 'a' and 1.5x as often as 'b'. But it is trivial to see that your algorithm does not always yield 'c' in the result.

    One algorithm which accomplishes this is to line up the items along the number line from 0 to 1 such that they occupy a segment sized proportionally to their weight, then randomly pick a number "start" between 0 and 1/x, then find all the points "start + n/x" (for all integers n such that the point is between 0 and 1) and yield the set containing the items marked by those points.

    In other words, something like:

    a.) optionally shuffle the list of elements (if you need random combinations of elements in addition to respecting the weights)  
    b.) create a list of cumulative weights, if you will, called borders, such that borders[0] = items[0].weight and borders[i] = borders[i - 1] + items[i].weight  
    c.) calculate the sum of all the weights => total_weight  
    d.) step_size = total_weight / x  
    e.) next_stop = pick a random number between [0, step_size)  
    f.) current_item = 0  
    g.) while next_stop < total_weight:
    h.)   while borders[current_item] < next_stop:  
    i.)     current_item += 1  
    j.)   append items[current_item] to the output  
    k.)   next_stop += step_size
    

    Note: this only works where the biggest weight <= step_size. If one of the elements has a weight greater than the total weight / x, then this problem isn't possible: you'd have to pick an element more than once in order to respect the weights.

    0 讨论(0)
提交回复
热议问题