Two left joins give me untrue data (double data?) with MySQL


梦谈多话 2021-01-26 06:56

This is my query:

SELECT `products`.*, SUM(orders.total_count) AS revenue,
    SUM(orders.quantity) AS qty, ROUND(AVG(product_reviews.stars)) as avg_stars 
FROM


      
      
        
3 answers

没有蜡笔的小新 (original poster) 2021-01-26 07:33

One approach to avoid that problem is to use a correlated subquery in the SELECT list, rather than a left join.

SELECT p.*
     , SUM(o.total_count) AS revenue
     , SUM(o.quantity) AS qty
     , ( SELECT ROUND(AVG(r.stars))
           FROM `product_reviews` r
          WHERE r.product_id = p.id 
       ) AS avg_stars
  FROM `products` p
  LEFT
  JOIN `orders` o
    ON o.product_id = p.id
   AND o.status IN ('delivered','new')
 GROUP BY p.id
 ORDER BY p.id DESC
 LIMIT 10
 OFFSET 0


This isn't the only approach, and it's not necessarily the best one, especially with large sets. But given that the subquery will run a maximum of 10 times (because of the LIMIT clause), performance should be reasonable, provided there is an appropriate index on product_reviews(product_id, stars).
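The correlated-subquery form can be checked against the sample data used later in this answer. This is only a minimal sketch, not the original schema: it runs on Python's built-in sqlite3 module rather than MySQL, the tables are trimmed to the columns the query needs, the status values given to the two sample orders are assumed, and the index name product_reviews_ix1 is made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (product_id INTEGER, status TEXT, quantity INTEGER);
    CREATE TABLE product_reviews (product_id INTEGER, stars REAL);

    -- Index suggested in the answer; the index name is hypothetical.
    CREATE INDEX product_reviews_ix1 ON product_reviews (product_id, stars);

    INSERT INTO products VALUES (101);
    -- Status values are assumed so that both orders pass the filter.
    INSERT INTO orders VALUES (101, 'delivered', 100), (101, 'new', 7);
    INSERT INTO product_reviews VALUES (101, 4.5), (101, 3), (101, 1);
""")

# Only orders is joined; the review average comes from a scalar subquery
# evaluated per product, so reviews cannot multiply the order rows.
row = conn.execute("""
    SELECT p.id
         , SUM(o.quantity) AS qty
         , ( SELECT ROUND(AVG(r.stars))
               FROM product_reviews r
              WHERE r.product_id = p.id
           ) AS avg_stars
      FROM products p
      LEFT JOIN orders o
        ON o.product_id = p.id
       AND o.status IN ('delivered','new')
     GROUP BY p.id
     ORDER BY p.id DESC
     LIMIT 10 OFFSET 0
""").fetchone()

product_id, qty, avg_stars = row
print(product_id, qty, avg_stars)  # qty is 107 (100 + 7), not inflated
```

Because reviews never enter the FROM clause, the quantity total stays at 100 + 7 regardless of how many reviews the product has.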

If you were returning all product ids, or a significant percentage of them, then using an inline view (a derived table, in MySQL terms) might give better performance, since it avoids the nested loops execution of the correlated subquery in the select list.

SELECT p.*
     , SUM(o.total_count) AS revenue
     , SUM(o.quantity) AS qty
     , s.avg_stars
  FROM `products` p
  LEFT
  JOIN `orders` o
    ON o.product_id = p.id
   AND o.status IN ('delivered','new')
  LEFT
  JOIN ( SELECT ROUND(AVG(r.stars)) AS avg_stars
              , r.product_id
           FROM `product_reviews` r
          GROUP BY r.product_id 
       ) s
    ON s.product_id = p.id
 GROUP BY p.id
 ORDER BY p.id DESC
 LIMIT 10
 OFFSET 0
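The inline-view query above can be tried the same way. Again this is a sketch under assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL, the tables carry only the columns the query touches, the order status values are assumed, and product 102 is an invented row with no orders or reviews, added to show what the left joins return for it.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (product_id INTEGER, status TEXT, quantity INTEGER);
    CREATE TABLE product_reviews (product_id INTEGER, stars REAL);

    -- Product 102 is hypothetical: it has no orders and no reviews.
    INSERT INTO products VALUES (101), (102);
    INSERT INTO orders VALUES (101, 'delivered', 100), (101, 'new', 7);
    INSERT INTO product_reviews VALUES (101, 4.5), (101, 3), (101, 1);
""")

# Reviews are collapsed to one row per product inside the derived table,
# so joining it adds at most one row per product and cannot inflate SUM().
rows = conn.execute("""
    SELECT p.id
         , SUM(o.quantity) AS qty
         , s.avg_stars
      FROM products p
      LEFT JOIN orders o
        ON o.product_id = p.id
       AND o.status IN ('delivered','new')
      LEFT JOIN ( SELECT r.product_id
                       , ROUND(AVG(r.stars)) AS avg_stars
                    FROM product_reviews r
                   GROUP BY r.product_id
                ) s
        ON s.product_id = p.id
     GROUP BY p.id
     ORDER BY p.id DESC
     LIMIT 10 OFFSET 0
""").fetchall()

for product_id, qty, avg_stars in rows:
    print(product_id, qty, avg_stars)
```

Product 101 comes back with a quantity of 107, and product 102 still appears, with NULL (None in Python) for both aggregates, because both joins are left joins.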




Just to be clear: the issue with the original query is that every order for a product is getting matched to every review for the product.

I apologize if my use of the term "semi-cartesian" was misleading or confusing. 

What I meant to convey was that you have two distinct sets (the set of orders for a product, and the set of reviews for a product), and that your query generates a "cross product" of those two sets, basically "matching" every order to every review (for a particular product).

For example, given three rows in reviews for product_id 101, and two rows in orders for product_id 101, e.g.:

REVIEWS
pid  stars text
---  ----- --------------
101  4.5   woo hoo perfect
101  3     ehh
101  1     totally sucked


ORDERS
pid  date   qty 
---  -----  ---
101  1/13   100
101  1/22   7


Your original query is essentially forming a result set with six rows in it, each row from orders being matched to all three rows from reviews:

id   date   qty   stars text
---  ----   ----  ----  ------------
101  1/13   100   4.5   woo hoo perfect
101  1/13   100   3     ehh
101  1/13   100   1     totally sucked
101  1/22   7     4.5   woo hoo perfect
101  1/22   7     3     ehh
101  1/22   7     1     totally sucked


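Then, when the SUM aggregate on qty gets applied, the values returned are way bigger than you expect: in this example SUM(qty) comes out as 321 instead of 107, because each order is counted three times, once per review.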
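The row multiplication is easy to reproduce with the sample rows above. The following is a small sketch, assuming SQLite (through Python's built-in sqlite3 module) in place of MySQL; the column names order_date and body are invented stand-ins for the date and text columns of the example tables.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    -- order_date and body are hypothetical column names.
    CREATE TABLE orders (product_id INTEGER, order_date TEXT, quantity INTEGER);
    CREATE TABLE product_reviews (product_id INTEGER, stars REAL, body TEXT);

    INSERT INTO products VALUES (101);
    INSERT INTO orders VALUES (101, '1/13', 100), (101, '1/22', 7);
    INSERT INTO product_reviews VALUES
        (101, 4.5, 'woo hoo perfect'),
        (101, 3,   'ehh'),
        (101, 1,   'totally sucked');
""")

# Joining both child tables to products pairs every order with every review.
joined_rows, inflated_qty = conn.execute("""
    SELECT COUNT(*), SUM(o.quantity)
      FROM products p
      LEFT JOIN orders o ON o.product_id = p.id
      LEFT JOIN product_reviews r ON r.product_id = p.id
     WHERE p.id = 101
""").fetchone()

# Summing the orders table on its own gives the intended total.
(true_qty,) = conn.execute(
    "SELECT SUM(quantity) FROM orders WHERE product_id = 101"
).fetchone()

print(joined_rows, inflated_qty, true_qty)  # 6 321 107
```

The join yields 2 × 3 = 6 rows, so the summed quantity is three times the real figure. The average of stars is not distorted in the same way, since every review is repeated the same number of times.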
    
             
                                                        
            
            
              
                