Question
When I execute this query it takes a lot of execution time, because the user_fans table contains 10000 user entries. How can I optimize it?
Query
SELECT uf.`user_name`,uf.`user_id`,
@post := (SELECT COUNT(*) FROM post WHERE user_id = uf.`user_id`) AS post,
@post_comment_likes := (SELECT COUNT(*) FROM post_comment_likes WHERE user_id = uf.`user_id`) AS post_comment_likes,
@post_comments := (SELECT COUNT(*) FROM post_comments WHERE user_id = uf.`user_id`) AS post_comments,
@post_likes := (SELECT COUNT(*) FROM post_likes WHERE user_id = uf.`user_id`) AS post_likes,
(@post+@post_comments) AS `sum_post`,
(@post_likes+@post_comment_likes) AS `sum_like`,
((@post+@post_comments)*10) AS `post_cal`,
((@post_likes+@post_comment_likes)*5) AS `like_cal`,
((@post*10)+(@post_comments*10)+(@post_likes*5)+(@post_comment_likes*5)) AS `total`
FROM `user_fans` uf ORDER BY `total` DESC LIMIT 20
Answer 1:
I would try to simplify this COMPLETELY by putting triggers on your other tables, and just adding a few columns to your user_fans table... one for each respective COUNT() you are trying to get, from post, post_likes, post_comments, and post_comment_likes.
When a record is added to whichever table, just update your user_fans table to add 1 to the corresponding count... it will be virtually instantaneous, being keyed on the user's ID anyhow. Similarly for the "likes": whenever something is registered as a like, add 1. Your query then becomes direct math on a single record and does not rely on ANY joins to compute the weighted total. As your tables get even larger, the current query will only get slower, since it has more data to pour through and aggregate: it goes through EVERY user_fans record, which in essence queries every record from all the other tables.
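For illustration, a minimal sketch of one such trigger, assuming a denormalized post_count column is added to user_fans (the column name and trigger name here are hypothetical, not part of the original schema):
-- Hypothetical counter column on user_fans
ALTER TABLE user_fans ADD COLUMN post_count INT NOT NULL DEFAULT 0;

DELIMITER $$
CREATE TRIGGER trg_post_after_insert
AFTER INSERT ON post
FOR EACH ROW
BEGIN
    -- keep the per-user counter in step with the post table
    UPDATE user_fans
       SET post_count = post_count + 1
     WHERE user_id = NEW.user_id;
END$$
DELIMITER ;
The other three counts would get analogous triggers (plus DELETE triggers subtracting 1, if rows can ever be removed).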
All that being said, keeping the tables as you have them, I would restructure as follows...
SELECT
uf.user_name,
uf.user_id,
@pc := coalesce( PostSummary.PostCount, 000000 ) as PostCount,
@pl := coalesce( PostLikes.LikesCount, 000000 ) as PostLikes,
@cc := coalesce( CommentSummary.CommentsCount, 000000 ) as PostComments,
@cl := coalesce( CommentLikes.LikesCount, 000000 ) as CommentLikes,
@pc + @cc AS sum_post,
@pl + @cl AS sum_like,
@pCalc := (@pc + @cc) * 10 AS post_cal,
@lCalc := (@pl + @cl) * 5 AS like_cal,
@pCalc + @lCalc AS `total`
FROM
( select @pc := 0,
@pl := 0,
@cc := 0,
@cl := 0,
@pCalc := 0,
@lCalc := 0 ) sqlvars,
user_fans uf
LEFT JOIN ( select user_id, COUNT(*) as PostCount
from post
group by user_id ) as PostSummary
ON uf.user_id = PostSummary.User_ID
LEFT JOIN ( select user_id, COUNT(*) as LikesCount
from post_likes
group by user_id ) as PostLikes
ON uf.user_id = PostLikes.User_ID
LEFT JOIN ( select user_id, COUNT(*) as CommentsCount
from post_comments
group by user_id ) as CommentSummary
ON uf.user_id = CommentSummary.User_ID
LEFT JOIN ( select user_id, COUNT(*) as LikesCount
from post_comment_likes
group by user_id ) as CommentLikes
ON uf.user_id = CommentLikes.User_ID
ORDER BY
`total` DESC
LIMIT 20
My variables are abbreviated as:
"@pc" = PostCount
"@pl" = PostLikes
"@cc" = CommentsCount
"@cl" = CommentLikes
"@pCalc" = weighted calc: (post count + comment count) * 10
"@lCalc" = weighted calc: (post likes + comment likes) * 5
The LEFT JOIN to the prequeries runs each of those aggregate queries ONCE through; the results are then joined, instead of being hit as a sub-query for every record. By using COALESCE(), you won't get hit with NULL values messing up the calcs when a user has no entries in one of the LEFT JOINed results, so I've defaulted them to 000000.
CLARIFICATION OF YOUR QUESTIONS
You can have any QUERY as an "AS AliasResult". AS can also be used to shorten long table names for readability. You can even alias the same table twice to pull similar content for different purposes.
select
MyAlias.SomeField
from
MySuperLongTableNameInDatabase MyAlias ...
select
c.LastName,
o.OrderAmount
from
customers c
join orders o
on c.customerID = o.customerID ...
select
PQ.SomeKey
from
( select ST.SomeKey
from SomeTable ST
where ST.SomeDate between X and Y ) as PQ
JOIN SomeOtherTable SOT
on PQ.SomeKey = SOT.SomeKey ...
Now, the third query above is not practical on its own, but it shows how a full query (resulting in the alias "PQ", for "PreQuery") can be used. You would do this if you wanted to pre-limit the rows with a set of complex conditions and work with a smaller set BEFORE doing extra joins to many other tables for the final results.
Since a "FROM" does not HAVE to be an actual table, but can be a query in itself, any place else used in the query, it has to know how to reference this prequery resultset.
Also, when querying fields, they too can be aliased "AS FinalColumnName" to simplify the results wherever they will be used.
select CONCAT( User.Salutation, User.LastName ) as CourtesyName from ...
select Order.NonTaxable + Order.Taxable + ( Order.Taxable * Order.SalesTaxRate ) as OrderTotalWithTax from ...
The "As" columnName is NOT required being an aggregate, but is most commonly seen that way.
Now, with respect to the MySQL variables... If you were writing a stored procedure, you would typically pre-declare them, setting their default values before the rest of the procedure. You can do the same in-line in a query by just setting them and giving that result an alias. A select like this always returns a SINGLE RECORD worth of values; it's almost like an updateable single record used within the query. You don't need to apply any specific JOIN conditions, as it has no bearing on the rest of the tables in the query... In essence it creates a Cartesian result, but one record crossed against any other table will never create duplicates, so no damage downstream.
select
...
from
( select @SomeVar := 0,
@SomeDate := curdate(),
@SomeString := "hello" ) as SQLVars
Now, how the sqlvars work. Think of a linear program... one assignment is executed in the exact sequence the query runs, and that value is then re-stored back in the "SQLVars" record, ready for the next time through. However, you don't reference it as SQLVars.SomeVar or SQLVars.SomeDate... just @SomeVar := someNewValue. When the @var is used in a query, it is also stored as an "AS ColumnName" in the result set. Sometimes this is just a place-holder computed value in preparation for the next record. Each value is then directly available for the next row. So, given the following sample...
select
@SomeVar := @SomeVar * 2 as FirstVal,
@SomeVar := @SomeVar * 2 as SecondVal,
@SomeVar := @SomeVar * 2 as ThirdVal
from
( select @SomeVar := 1 ) sqlvars,
AnotherTable
limit 3
Will result in 3 records with the values of
FirstVal SecondVal ThirdVal
2 4 8
16 32 64
128 256 512
Notice how the value of @SomeVar is updated as each column uses it... so even within the same record, the updated value is immediately available for the next column. That said, now look at building a simulated record count / ranking per customer...
select
o.CustomerID,
o.OrderID,
@SeqNo := if( @LastID = o.CustomerID, @SeqNo +1, 1 ) as CustomerSequence,
@LastID := o.CustomerID as PlaceHolderToSaveForNextRecordCompare
from
orders o,
( select @SeqNo := 0, @LastID := 0 ) sqlvars
order by
o.CustomerID
The "Order By" clause forces the results to be returned in sequence first. So, here, the records per customer are returned. First time through, LastID is 0 and customer ID is say...5. Since different, it returns 1 as the @SeqNo, THEN it preserves that customer ID into the @LastID field for the next record. Now, next record for customer... Last ID is the the same, so it takes the @SeqNo (now = 1), and adds 1 to 1 and becomes #2 for the same customer... Continue on the path...
As for getting better at writing queries, take a look at the MySQL tag and look at some of the heavy contributors. Look into the questions and some of the complex answers and how the problem solving works. Not to say that others with lower reputation scores just starting out aren't completely competent, but you'll find who gives good answers and why. Look at their history of posted answers too. The more you read and follow, the better a handle you'll get on writing more complex queries.
Answer 2:
- You can convert this query to use a GROUP BY clause instead of a subquery for each column.
- You can create indexes on the join columns (this will be the most helpful way to optimize your query's response time), as sketched below.
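A minimal sketch of the index suggestion (the index names are hypothetical; the GROUP BY conversion itself is what the restructured query in Answer 1 demonstrates):
-- One index per child table on the user_id column used in the lookups
CREATE INDEX idx_post_user_id               ON post (user_id);
CREATE INDEX idx_post_likes_user_id         ON post_likes (user_id);
CREATE INDEX idx_post_comments_user_id      ON post_comments (user_id);
CREATE INDEX idx_post_comment_likes_user_id ON post_comment_likes (user_id);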
Answer 3:
10000 user records isn't much data at all.
There may be work you can do on the database itself:
1) Have you got the relevant indexes set on the foreign keys (an index on user_id in each of the tables)? Try running EXPLAIN in front of the query (see the sketch after this list): http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
2) Are your data types correct?
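For example, EXPLAIN can be put in front of a trimmed-down version of the original query (trimmed here purely for illustration):
-- Rows showing type = ALL in the output are full table scans,
-- i.e. tables that would benefit from an index on user_id.
EXPLAIN SELECT uf.user_name, uf.user_id,
    (SELECT COUNT(*) FROM post WHERE user_id = uf.user_id) AS post
FROM user_fans uf
ORDER BY post DESC
LIMIT 20;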
Answer 4:
See the difference between my query (image 1) and @DRapp's (image 2) in execution time and EXPLAIN output. When I read @DRapp's answer I realized what I was doing wrong in this query and why it took so much time. Basically the answer is simple: my query depended on a subquery per column, while @DRapp used derived tables (temporary/filesort) with the help of session variables, aliases, and joins...
Image 1 execution time: 00:02:56:321
Image 2 execution time: 00:00:32:860
Source: https://stackoverflow.com/questions/10448695/how-to-optimize-query-if-table-contain-10000-entries-using-mysql