I need a table to store some ratings. In this table I have a composite index (user_id, post_id) and another column to identify different rating systems.
Could the missing PRIMARY KEY cause performance problems?
Yes, in InnoDB for sure. When a table has no PRIMARY KEY (and no suitable NOT NULL UNIQUE index), InnoDB uses an algorithm to generate its own "ROWID", which is defined in dict0boot.ic:
    /**
    Returns a new row id.
    @return the new id */
    UNIV_INLINE
    row_id_t
    dict_sys_get_new_row_id(void)
    /*=========================*/
    {
        row_id_t id;

        mutex_enter(&(dict_sys->mutex));

        id = dict_sys->row_id;

        if (0 == (id % DICT_HDR_ROW_ID_WRITE_MARGIN)) {
            dict_hdr_flush_row_id();
        }

        dict_sys->row_id++;

        mutex_exit(&(dict_sys->mutex));

        return(id);
    }
The main problem in that code is mutex_enter(&(dict_sys->mutex));, which blocks other threads from entering this section while one thread is already running it.
In effect, row id generation is serialized behind a single lock, much like the table lock MyISAM would take.
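To make the serialization concrete, here is a minimal sketch of the same pattern using POSIX threads. It is not InnoDB code: `row_id_mutex`, `next_row_id`, `get_new_row_id`, `insert_worker` and `run_workers` are hypothetical stand-ins, and the periodic flush is left out. Every worker has to take the one mutex for each id, so the ids are handed out one at a time no matter how many threads are inserting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N_THREADS 4
#define IDS_PER_THREAD 100000

/* Hypothetical stand-in for dict_sys: one counter guarded by one mutex. */
static pthread_mutex_t row_id_mutex = PTHREAD_MUTEX_INITIALIZER;
static uint64_t next_row_id = 0;

/* Simplified analogue of dict_sys_get_new_row_id(), without the flush. */
static uint64_t get_new_row_id(void)
{
    uint64_t id;

    pthread_mutex_lock(&row_id_mutex);   /* every caller queues up here */
    id = next_row_id++;
    pthread_mutex_unlock(&row_id_mutex);

    return id;
}

/* Each worker plays the role of a session inserting rows. */
static void *insert_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < IDS_PER_THREAD; i++) {
        (void)get_new_row_id();
    }
    return NULL;
}

/* Start the workers, wait for them, and return the final counter value. */
static uint64_t run_workers(void)
{
    pthread_t threads[N_THREADS];

    for (int i = 0; i < N_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, insert_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < N_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return next_row_id;
}
```

Calling `run_workers()` once from `main` (compiled with `-pthread`) should leave the counter at exactly `N_THREADS * IDS_PER_THREAD`. The mutex is what guarantees that no id is handed out twice, and it is also what forces the threads to take turns.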
The % may take a few nanoseconds. That is insignificant compared to everything else. Anyway, the source has #define DICT_HDR_ROW_ID_WRITE_MARGIN 256
Indeed, Rick James, this is insignificant compared to the mutex contention mentioned above.
The C/C++ compiler will probably micro-optimize it further by emitting cheaper CPU instructions.
Still, the main performance concern is the mutex described above.
Also, the modulo operator (%) is a comparatively CPU-heavy instruction.
But depending on the C/C++ compiler (and/or configuration options), it might be optimized when DICT_HDR_ROW_ID_WRITE_MARGIN is a power of two,
for example into (0 == (id & (DICT_HDR_ROW_ID_WRITE_MARGIN - 1))), as bitmasking is much faster. I believed DICT_HDR_ROW_ID_WRITE_MARGIN was a power of two, and 256, the value quoted above, is one.
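The equivalence between the two checks can be verified with a short sketch. It assumes an unsigned 64-bit id and the margin of 256 quoted above. The helper names `needs_flush_modulo`, `needs_flush_mask` and `count_mismatches` are made up for illustration and are not part of InnoDB.

```c
#include <assert.h>
#include <stdint.h>

/* Value quoted in the comment thread. */
#define DICT_HDR_ROW_ID_WRITE_MARGIN 256

/* Hypothetical helper: the check as written in dict_sys_get_new_row_id(). */
static int needs_flush_modulo(uint64_t id)
{
    return 0 == (id % DICT_HDR_ROW_ID_WRITE_MARGIN);
}

/* Hypothetical helper: the same check as a bitmask.
   Only valid when the margin is a power of two. */
static int needs_flush_mask(uint64_t id)
{
    /* A power of two has exactly one bit set, so n & (n - 1) is zero. */
    assert((DICT_HDR_ROW_ID_WRITE_MARGIN
            & (DICT_HDR_ROW_ID_WRITE_MARGIN - 1)) == 0);

    return 0 == (id & (DICT_HDR_ROW_ID_WRITE_MARGIN - 1));
}

/* Count the ids in [0, n) on which the two checks disagree. */
static uint64_t count_mismatches(uint64_t n)
{
    uint64_t mismatches = 0;

    for (uint64_t id = 0; id < n; id++) {
        if (needs_flush_modulo(id) != needs_flush_mask(id)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

For unsigned operands and a power-of-two divisor, both forms test whether the low bits of the id are all zero, so `count_mismatches` returns 0 for any range. Whether a compiler performs this rewrite on its own depends on the compiler and its optimization settings.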
A few points:
It sounds like you are just taking what is currently unique about the table and making that the primary key. That works. Natural keys have some advantages when it comes to querying because of locality (the data for each user is stored in the same area), and because the table is clustered by that key, which eliminates lookups to the data if you are searching by the columns in the primary key.
But using a natural primary key like the one you chose has disadvantages for performance as well.
Using a very large primary key will make all other indexes very large in InnoDB, because the primary key is included in each index entry.
Using a natural primary key isn't as fast as a surrogate key for INSERTs because, in addition to being bigger, it can't just insert at the end of the table each time. It has to insert into the section for that user, post, etc.
Also, if you are searching by time, you will most likely be seeking all over the table with a natural key unless time is your first column. Surrogate keys tend to be local in time and can often be just right for some queries.
Using a natural key like yours as a primary key can also be annoying. What if you want to refer to a particular vote? You need several fields. It's also a little difficult to use with many ORMs.
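The point about index size can be put into rough numbers. The sketch below rests on assumptions, since the question does not give column types: user_id and post_id are taken to be 4-byte INTs, type a 1-byte TINYINT, and the secondary index is on a hypothetical 4-byte created_at column. Per-record overhead is ignored, and `secondary_entry_bytes` and `natural_pk_bytes` are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed column widths in bytes; the question does not state the types. */
enum {
    SIZE_USER_ID      = 4, /* assuming INT */
    SIZE_POST_ID      = 4, /* assuming INT */
    SIZE_TYPE         = 1, /* assuming TINYINT */
    SIZE_SURROGATE_ID = 4, /* INT UNSIGNED surrogate key */
    SIZE_CREATED_AT   = 4  /* hypothetical TIMESTAMP column */
};

/* Width of the natural key (user_id, post_id, type) under the assumptions. */
static size_t natural_pk_bytes(void)
{
    return SIZE_USER_ID + SIZE_POST_ID + SIZE_TYPE;
}

/* Rough payload of one secondary index entry: the indexed columns plus
   the primary key columns InnoDB stores alongside them. */
static size_t secondary_entry_bytes(size_t indexed_bytes, size_t pk_bytes)
{
    /* An InnoDB table always has some clustered key. */
    assert(pk_bytes > 0);

    return indexed_bytes + pk_bytes;
}
```

Under these assumptions, `secondary_entry_bytes(SIZE_CREATED_AT, natural_pk_bytes())` is 13 bytes, against 8 bytes for `secondary_entry_bytes(SIZE_CREATED_AT, SIZE_SURROGATE_ID)`. The gap is modest for this particular key and grows quickly once the natural key contains wide columns such as strings.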
Here's the Answer
I would create your own surrogate key and use it as the primary key rather than rely on InnoDB's internal row id, because you'll be able to use it for updates and lookups.
    ALTER TABLE tbl_rate
        ADD id INT UNSIGNED NOT NULL AUTO_INCREMENT,
        ADD PRIMARY KEY (id);
But if you do create a surrogate primary key, I'd also make your natural key UNIQUE. It costs roughly the same as a plain index, but it enforces correctness.
    ALTER TABLE tbl_rate
        ADD UNIQUE (user_id, post_id, type);