Difference between local and global indexes in DynamoDB

轻奢々 2020-12-22 15:42

I'm curious about these two secondary indexes and the differences between them. It is hard to picture what they look like. I think an explanation will help more people than just me.

7 answers
  •  眼角桃花
    2020-12-22 16:26

    One way to put it is this:

    LSI - lets you query within a single Hash-Key, using a different attribute as an alternate Range-Key to sort or restrict the results.

    GSI - lets you query on a different Hash-Key (and optional Range-Key) than the table's own, at the cost of extra throughput.

    A more extensive breakdown of the table types and how they work, below:

    Hash Only

    As you probably already know, a Hash-Key by itself must be unique: writing to a Hash-Key that already exists overwrites the existing item.

    Hash+Range

    A Hash-Key + Range-Key allows you to have multiple items with the same Hash-Key, as long as each has a different Range-Key. In this case, if you write to a Hash-Key that already exists but use a Range-Key that is not already used by that Hash-Key, it makes a new item, whereas if an item with the same Hash+Range combination already exists, it overwrites the matching item.
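
    The overwrite rule above can be sketched with a plain dictionary. This is only a toy in-memory model, not the DynamoDB API; the `put_item` helper and the `size` attribute are made up for illustration.

```python
# Toy in-memory model of a Hash+Range table; this is NOT the DynamoDB API.
table = {}

def put_item(item, hash_attr="fileName", range_attr="fileFormat"):
    """Store the item under its (hash, range) pair, replacing any existing item."""
    key = (item[hash_attr], item[range_attr])
    table[key] = item

put_item({"fileName": "report", "fileFormat": "pdf", "size": 10})
put_item({"fileName": "report", "fileFormat": "docx", "size": 20})  # same hash, new range: new item
put_item({"fileName": "report", "fileFormat": "pdf", "size": 99})   # same hash+range: overwrites

print(len(table))                        # 2
print(table[("report", "pdf")]["size"])  # 99
```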

    Another way to think of this is like a file with a format. You can have a file with the same name (hash) as another, in the same folder (table), as long as their format (range) is different. Likewise, you can have multiple files of the same format as long as their name is different.

    LSI

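    An LSI uses the same Hash-Key as the table plus a different attribute as its Range-Key, and items are created under the same rules as a plain Hash-Key + Range-Key table. The LSI attribute is optional on each item: an item that omits it is still stored in the table but does not appear in that index. If you do supply the attribute, it must match the type declared for the index key.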

    To say an LSI is "Range-Key 2" is not entirely correct as you cannot have (using my file and format analogy from earlier) a file named: file.format.lsi and file.format.lsi2. You can, however, have file.format.lsi and file.format2.lsi or file.format.lsi and file2.format.lsi.

    Basically, an LSI does not add to an item's identity: your base Hash and Range value combination must still be unique, while the LSI values do not have to be unique at all. An easier way to look at it may be to think of the LSI as data within the files. You could write code that finds all the files with the name "PROJECT101", regardless of their fileFormat, then reads the data inside to decide what to include in the query and what to omit. This is basically how an LSI works, except that DynamoDB keeps the index sorted by the LSI attribute, so there is no overhead of opening each file to read its contents.
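
    As a rough illustration of the "PROJECT101" example, here is a toy model of an LSI query in plain Python. It is not the DynamoDB API; the `lastModified` attribute and the `query_lsi` helper are hypothetical. It assumes, as I understand DynamoDB to behave, that an item without the index attribute is simply absent from the index.

```python
# Toy model of querying an LSI; attribute names are hypothetical.
items = [
    {"fileName": "PROJECT101", "fileFormat": "pdf",  "lastModified": "2020-03-01"},
    {"fileName": "PROJECT101", "fileFormat": "docx", "lastModified": "2020-01-15"},
    {"fileName": "PROJECT101", "fileFormat": "txt"},  # no lastModified: not in the index
    {"fileName": "PROJECT202", "fileFormat": "pdf",  "lastModified": "2020-02-01"},
]

def query_lsi(items, hash_value, since,
              hash_attr="fileName", index_attr="lastModified"):
    """Return items for one hash key whose index attribute is >= since, sorted by it."""
    matches = [
        it for it in items
        if it[hash_attr] == hash_value
        and index_attr in it            # items lacking the attribute are skipped
        and it[index_attr] >= since     # ISO dates compare correctly as strings
    ]
    return sorted(matches, key=lambda it: it[index_attr])

result = query_lsi(items, "PROJECT101", "2020-01-01")
formats = [it["fileFormat"] for it in result]
print(formats)  # ['docx', 'pdf']
```

    Note that the query is still pinned to one Hash-Key value; only the attribute used for sorting and range conditions changes.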

    GSI

    For GSI, you're essentially creating another table for each GSI, but without the hassle of maintaining multiple separate tables that mirror data between them. Every write to an indexed item is also written to the index, which is why GSIs cost more throughput.

    So for a GSI, you could specify fileName as your base Hash-Key, and fileFormat as your base Range-Key. You can then specify a GSI that has a Hash-Key of fileName2 and a Range-Key of fileFormat2. You can then query on either fileName or fileName2 if you like, unlike LSI where you can only query on fileName.
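
    To make the key layout concrete, here is a sketch of the parameters one might pass to boto3's `create_table` for such a table. The table name, index names, capacity numbers, and the `lastModified` LSI attribute are assumptions for illustration; the dictionary is only built and inspected here, not sent to AWS.

```python
# Sketch of create_table parameters; names and capacity values are hypothetical.
table_params = {
    "TableName": "Files",
    "AttributeDefinitions": [  # every attribute used in any key schema
        {"AttributeName": "fileName", "AttributeType": "S"},
        {"AttributeName": "fileFormat", "AttributeType": "S"},
        {"AttributeName": "lastModified", "AttributeType": "S"},
        {"AttributeName": "fileName2", "AttributeType": "S"},
        {"AttributeName": "fileFormat2", "AttributeType": "S"},
    ],
    "KeySchema": [  # base table key
        {"AttributeName": "fileName", "KeyType": "HASH"},
        {"AttributeName": "fileFormat", "KeyType": "RANGE"},
    ],
    "LocalSecondaryIndexes": [{  # same hash key as the table, different range key
        "IndexName": "ByLastModified",
        "KeySchema": [
            {"AttributeName": "fileName", "KeyType": "HASH"},
            {"AttributeName": "lastModified", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    "GlobalSecondaryIndexes": [{  # its own hash and range key, its own throughput
        "IndexName": "ByFileName2",
        "KeySchema": [
            {"AttributeName": "fileName2", "KeyType": "HASH"},
            {"AttributeName": "fileFormat2", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}

def hash_key(key_schema):
    """Return the name of the HASH attribute in a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")

table_hash = hash_key(table_params["KeySchema"])
lsi_hash = hash_key(table_params["LocalSecondaryIndexes"][0]["KeySchema"])
gsi_hash = hash_key(table_params["GlobalSecondaryIndexes"][0]["KeySchema"])
print(table_hash, lsi_hash, gsi_hash)  # fileName fileName fileName2

# With AWS credentials configured, this could be sent with:
#   boto3.client("dynamodb").create_table(**table_params)
```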

    The main advantages are that you only have to maintain one table instead of two, and any time you write an item to the table, DynamoDB updates the GSI(s) for you, so you can't "forget" to update the other table(s) like you can with a multi-table setup. You also don't have to handle a connection dropping after updating one table and before updating the other. Keep in mind that the GSI is updated asynchronously, so reads from a GSI are eventually consistent.

    Additionally, a GSI can "overlap" the base Hash/Range combination. So if you wanted to make a table with fileName and fileFormat as your base Hash/Range and filePriority and fileName as your GSI, you can.

    Lastly, a GSI Hash+Range combination does not have to be unique, while the base Hash+Range combination does have to be unique. This is something that is not possible with a dual/multi table setup, but is with GSI. The GSI key attributes are optional, though: you must always provide the base Hash+Range values when writing an item, but an item that leaves out a GSI key attribute is simply not included in that GSI.
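
    A toy sketch of how a GSI relates to the base table, using the filePriority/fileName index mentioned above. This is plain Python, not the DynamoDB API, and `build_gsi` is a made-up helper. It assumes that items missing a GSI key attribute are left out of the index, which is my understanding of how DynamoDB treats them.

```python
from collections import defaultdict

# Base table: (fileName, fileFormat) must be unique, so a dict models it.
base_table = {
    ("report", "pdf"):  {"fileName": "report", "fileFormat": "pdf",  "filePriority": "high"},
    ("report", "docx"): {"fileName": "report", "fileFormat": "docx", "filePriority": "high"},
    ("notes", "txt"):   {"fileName": "notes",  "fileFormat": "txt"},  # no filePriority
}

def build_gsi(table, hash_attr="filePriority", range_attr="fileName"):
    """Group items by the GSI key; several items may share one key."""
    index = defaultdict(list)
    for item in table.values():
        if hash_attr in item and range_attr in item:  # skip items without the GSI key
            index[(item[hash_attr], item[range_attr])].append(item)
    return index

gsi = build_gsi(base_table)
indexed_count = sum(len(entries) for entries in gsi.values())

print(len(gsi[("high", "report")]))  # 2: same GSI key, two distinct base items
print(indexed_count)                 # 2: the "notes" item is not in the index
```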
