select 30 random rows where sum amount = x

感情败类 2020-12-10 02:39

I have a table

items
id int unsigned auto_increment primary key,
name varchar(255),
price DECIMAL(6,2)

I want to get at least 30 random ite

7 answers
  •  自闭症患者
    2020-12-10 03:17

    There is a solution if your product list satisfies the following assumption:

    You have products at every price between 0.00 and 500.00, e.g. 0.01, 0.02 and so on up to 499.99, or perhaps 0.05, 0.10 and so on up to 499.95.

    The algorithm is based on the following:

    In a collection of n positive numbers that sum to S, at least one of them is less than or equal to S divided by n (S/n). If every number were greater than S/n, the sum would exceed S.

    In this case, the steps are:

    1. Select a random product where price < 500/30. Call its price X.
    2. Select a random product where price < (500 - X)/29. Call its price Y.
    3. Select a random product where price < (500 - X - Y)/28.

    Continue in this way until you have 29 products. For the last product, select one where price = remaining price (or use price <= remaining price, order by price desc, and hope the result is close enough).
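
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the stored procedures from this answer: `pick_items` is a hypothetical helper, prices are held in memory as integer cents, and duplicates are allowed.

```python
import random


def pick_items(prices, target, count, rng=random):
    """Pick `count` prices (integer cents) that sum to `target`, or None.

    Assumption: `prices` is an in-memory list; duplicates may be picked.
    """
    available = set(prices)
    chosen = []
    remaining = target
    # Pick all but the last item, each strictly below remaining / slots_left.
    for slots_left in range(count, 1, -1):
        candidates = [p for p in prices if p * slots_left < remaining]
        if not candidates:
            return None  # no product is cheap enough for this step
        price = rng.choice(candidates)
        chosen.append(price)
        remaining -= price
    # The last item must cost exactly what is left.
    if remaining not in available:
        return None
    chosen.append(remaining)
    return chosen


# Hypothetical catalogue: one product at every price from 0.01 to 499.99.
catalogue = list(range(1, 50_000))
basket = pick_items(catalogue, 50_000, 30, random.Random(1))
print(len(basket), sum(basket))  # prints: 30 50000
```

With a catalogue this dense the last lookup always succeeds. With a sparse catalogue the function returns None whenever no product costs exactly the remaining amount.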

    For the table items:

    Get a random product priced below a maximum price:

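    -- Pick one random product priced below maxPrice.
    -- maxPrice is DECIMAL so that fractional limits are not truncated,
    -- and both results are declared OUT so the caller can read them.
    CREATE PROCEDURE getRandomProduct (IN maxPrice DECIMAL(8,2), OUT productId INT, OUT productPrice DECIMAL(8,2))
    BEGIN
       -- Default id when no row matches.
       SET productId = 0;
           SELECT id, price INTO productId, productPrice
           FROM items
           WHERE price < maxPrice
           ORDER BY RAND()
           LIMIT 1;
    END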
    

    Get 29 random products:

    CREATE PROCEDURE get29products(OUT str TEXT, OUT remainingPrice DECIMAL(8,2))
    BEGIN
      DECLARE x INT;
      SET x = 30;
      SET str = '';
      SET remainingPrice = 500.00;
    
      REPEAT
        -- Cap each pick at the remaining budget divided by the items still needed.
        CALL getRandomProduct(remainingPrice/x, @id, @price);
        -- CONCAT_WS skips the NULL first value, so the list has no leading comma.
        SET str = CONCAT_WS(',', NULLIF(str, ''), @id);
        SET x = x - 1;
        SET remainingPrice = remainingPrice - @price;
        UNTIL x <= 1
      END REPEAT;
    END
    

    Call the procedure:

    CALL `get29products`(@p0, @p1); SELECT @p0 AS `str`, @p1 AS `remainingPrice`;
    

    Finally, try to find a last product whose price equals the returned remainingPrice, so that the total reaches 500.

    Alternatively, you could select 28 and use the solution on the linked question you provided to get a couple of products that sum to the remaining price.

    Note that duplicate products are allowed. To avoid duplicates, you could extend getRandomProduct with an additional IN parameter of the products already found and add a condition NOT IN to exclude them.
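
As a rough illustration of that NOT IN extension, here is a sketch in Python using the standard `sqlite3` module as a stand-in for MySQL. The `random_product` helper, the `price_cents` column and the sample rows are all assumptions made for the example; in MySQL the random ordering would use RAND() rather than RANDOM().

```python
import sqlite3


def random_product(conn, max_price_cents, exclude_ids=()):
    """Return a random (id, price_cents) row below the cap, or None."""
    sql = "SELECT id, price_cents FROM items WHERE price_cents < ?"
    params = [max_price_cents]
    if exclude_ids:
        # One placeholder per already-chosen id keeps the query parameterised.
        marks = ",".join("?" * len(exclude_ids))
        sql += f" AND id NOT IN ({marks})"
        params.extend(exclude_ids)
    sql += " ORDER BY RANDOM() LIMIT 1"  # MySQL would use RAND() here
    return conn.execute(sql, params).fetchone()


# Hypothetical sample data: three products, two of them below 10.00.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO items (name, price_cents) VALUES (?, ?)",
    [("a", 500), ("b", 900), ("c", 1200)],
)
first = random_product(conn, 1000)
second = random_product(conn, 1000, exclude_ids=[first[0]])
```

Here `second` can never be the same product as `first`, and the function returns None once every affordable product has been excluded.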

    Update: You could overcome the above limitation, so that you always find collections that sum to 500, by using a cron process as described in the 2nd section below.

    2nd section: Using a cron process

    Building on @Michael Zukowski's suggestion, you could

    • create a table to hold the collections found
    • define a cron process that runs the above algorithm a number of times (for example 10 times) at a fixed interval, e.g. every 5 minutes
    • if a collection is found that matches the sum, add it to the new table

    This way you can find collections that always sum exactly to 500. When a user makes a request, you could select a random collection from the new table.
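
One possible shape for that table and the two operations around it, sketched in Python with `sqlite3` standing in for the real database. The `collections` table layout, the comma-separated `item_ids` column and both helper names are assumptions for illustration; the `created_at` column is there so that old collections can be filtered or discarded.

```python
import sqlite3


def save_if_exact(conn, item_ids, prices_cents, target_cents):
    """Store the collection only when its prices sum exactly to the target."""
    if sum(prices_cents) != target_cents:
        return False
    conn.execute(
        "INSERT INTO collections (item_ids) VALUES (?)",
        (",".join(str(i) for i in item_ids),),
    )
    return True


def random_collection(conn):
    """Return the item ids of one stored collection, or None if none exist."""
    row = conn.execute(
        "SELECT item_ids FROM collections ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    return [int(i) for i in row[0].split(",")] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collections ("
    "id INTEGER PRIMARY KEY, item_ids TEXT NOT NULL, "
    "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)
stored = save_if_exact(conn, [3, 7], [20_000, 30_000], 50_000)   # exact match
skipped = save_if_exact(conn, [3, 8], [20_000, 29_000], 50_000)  # off by 10.00
picked = random_collection(conn)
```

The cron job would call `save_if_exact` after each run of the algorithm, and a user request would only call `random_collection`.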

    Even with a match rate of 20%, a cron process that runs the algorithm 10 times every 5 minutes would find more than 500 collections in 24 hours (2,880 runs × 20% = 576).
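
The arithmetic behind that estimate, written out in Python so the numbers can be changed. The 20% match rate is the figure assumed in the text, not a measured value.

```python
runs_per_batch = 10        # algorithm runs per cron invocation
interval_minutes = 5       # cron interval
match_rate_percent = 20    # assumed share of runs that hit the target exactly

batches_per_day = 24 * 60 // interval_minutes             # 288
runs_per_day = runs_per_batch * batches_per_day           # 2880
expected_collections = runs_per_day * match_rate_percent // 100
print(expected_collections)  # prints: 576
```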

    Using a cron process has the following advantages and disadvantages in my opinion:

    Advantages

    • find exact matches
    • no process on client request
    • even with a low match rate, you can find several collections

    Disadvantages

    • if the price data are updated frequently, stored collections can become inconsistent with current prices, and a cron process may not work at all
    • you have to discard or filter old collections
    • the result will probably not be random per client, as different clients are likely to see the same collection
