Does Amazon QLDB have any scaling/performance limits?


Question


The main Amazon QLDB page says

QLDB is also serverless, so it automatically scales to support the demands of your application.

However, even products like DynamoDB—with practically unbounded automatic scaling—have some scaling limits. (For example, DynamoDB has a max of 3k RCU per partition key.)

I’m trying to find out the scaling/performance limits of QLDB. Is there any max TPS or max throughput per key, table, ledger, or account? Is there a maximum storage size per table or ledger or account?

As of October 2019, there’s no mention of any scaling limits on the QLDB Quotas and Limits page.

The QLDB FAQ page says,

Amazon QLDB can execute 2 – 3X as many transactions than ledgers in common blockchain frameworks.

That’s a start, but it’s not very helpful because “2-3X” is a relatively wide range, and they haven’t specified which blockchain frameworks they consider common.

Has anyone found any info (in the documentation, in AWS blog posts, from a deep-dive session, etc.) on whether there are any such limits?


Answer 1:


As you note, with any system there are limits. The only true answer to your question would require benchmarking your use case to see what numbers you get. I don't want to mislead you!

That said, I can help you understand some QLDB fundamentals which will help you build a mental model for how the system should behave for different workloads.

The first concept to understand is the document-revision model. In QLDB, documents are inserted, then updated (revised), and eventually deleted. Each document has a QLDB-assigned UUID, and each revision has a QLDB-assigned version number that is strictly monotonically increasing and dense. Documents are revised by issuing transactions (sending PartiQL statements) over a QLDB session.
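To make this concrete, here is a rough sketch using the Python driver (pyqldb). The ledger name and the Accounts table are made up for illustration, and the table is assumed to already exist:

```python
# Minimal sketch of the document-revision model, assuming a ledger
# named "test-ledger" and an existing (hypothetical) Accounts table.
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="test-ledger")

# INSERT creates a new document; QLDB assigns it a UUID and version 0.
driver.execute_lambda(lambda tx: tx.execute_statement(
    "INSERT INTO Accounts ?", {"owner": "Mary", "balance": 100}))

# UPDATE creates the next revision (version 1) of the same document.
driver.execute_lambda(lambda tx: tx.execute_statement(
    "UPDATE Accounts SET balance = ? WHERE owner = ?", 150, "Mary"))

# The built-in history() function exposes every revision; each row's
# metadata carries the document UUID and its version number. Results
# must be materialized inside the transaction, hence list().
revisions = driver.execute_lambda(lambda tx: list(tx.execute_statement(
    "SELECT metadata.id, metadata.version, data FROM history(Accounts)")))
```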

Next, transactions. A transaction typically reads some state and then either continues or abandons. For example, if you are building a banking application with the use case of transferring money from Mary to Joe, the transaction may be "read Mary's balance", "read Joe's balance", "set Mary's balance", and "set Joe's balance". In between, your application can enforce constraints. For example, if it determines that Mary's balance is less than the transferred amount, it would abandon the transaction. If the transaction succeeds, two new revisions are created (one for Mary's account and one for Joe's).
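Here is a sketch of that transfer against the same hypothetical table; everything inside the lambda runs as one transaction, and raising an exception abandons it:

```python
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="test-ledger")

def transfer(tx, amount):
    # Read both balances within the transaction.
    mary = next(tx.execute_statement(
        "SELECT balance FROM Accounts WHERE owner = ?", "Mary"))
    joe = next(tx.execute_statement(
        "SELECT balance FROM Accounts WHERE owner = ?", "Joe"))
    # Enforce the application-level constraint; raising here abandons
    # the transaction, so neither balance is changed.
    if mary["balance"] < amount:
        raise ValueError("insufficient funds")
    # On commit, these two statements yield two new document revisions.
    tx.execute_statement("UPDATE Accounts SET balance = ? WHERE owner = ?",
                         mary["balance"] - amount, "Mary")
    tx.execute_statement("UPDATE Accounts SET balance = ? WHERE owner = ?",
                         joe["balance"] + amount, "Joe")

driver.execute_lambda(lambda tx: transfer(tx, 25))
```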

The next concept is Optimistic Concurrency Control (OCC), which is explained at https://docs.aws.amazon.com/qldb/latest/developerguide/concurrency.html. When you attempt to commit a transaction, QLDB will reject it if another transaction interfered with the one you are attempting to commit. For example, if another withdrawal was made from Mary's account after you read her balance, your commit will fail due to an OCC conflict, allowing you to retry the transaction (and re-check that Mary still has enough money). Thus, the nature of your transactions will affect your performance: if you are reading account balances and producing new balances based on those reads, you will have lower throughput than if you are creating new accounts or setting accounts to arbitrary amounts (neither of which requires any reads).
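With pyqldb the retry loop is built in: execute_lambda re-runs the entire lambda when a commit fails with an OCC conflict, which is why the reads belong inside it. A sketch, with an illustrative retry limit:

```python
# Sketch: the driver re-executes the whole lambda when the commit fails
# with an OCC conflict, so the balance is re-read and the "does Mary
# still have enough money" check is repeated on every attempt.
from pyqldb.config.retry_config import RetryConfig
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="test-ledger",
                    retry_config=RetryConfig(retry_limit=4))

# transfer() as sketched above; if another writer revises Mary's
# document between our read and our commit, this is retried up to
# four times before the conflict is surfaced to the caller.
driver.execute_lambda(lambda tx: transfer(tx, 25))
```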

The fourth concept is that of the Journal. QLDB is a "Journal first" database: all transactions are first written to a distributed log which is then used to update indexed storage. The QLDB architecture abstracts the physical log implementation for you but does expose the concept of a "strand", which is a partition of the Journal. Each strand has a fixed amount of capacity (new revisions per second). QLDB currently (late 2019) restricts each ledger to a single strand.

Putting this together, hopefully I can help you with your questions:

  1. Max TPS. The theoretical upper bound is the max TPS of a single strand. There isn't a single fixed number, as various factors may influence it, but it is many thousands of TPS.
  2. Max TPS per document. This will never exceed the max TPS, but will be bound more by OCC than anything else. If you are simply inserting new documents (no reads) you will have zero OCC conflicts. If you are reading, you will be bound by the time it takes us to update our indexed storage from the Journal. 100 TPS is a good starting point.
  3. Max per table. There are no per-table limits, other than those imposed by other limits (i.e. the per-document limit or the strand limit).
  4. Max per account. We have no account-wide limits on the "QLDB Session" API. Each ledger is an island.
  5. Max size per table, ledger or account. There are no limits here.

A note on sessions: we have a default limit of 1,500 concurrent sessions per ledger. Each session can have only one active transaction, and each transaction takes some amount of time, whether due to PartiQL query time, network round-trips, or work your application is doing with the results. This imposes an upper bound on your throughput. We do allow customers to increase this limit, as described at https://docs.aws.amazon.com/qldb/latest/developerguide/limits.html.
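For illustration, here is how a single process could cap its share of those sessions with pyqldb (the numbers are made up):

```python
# Sketch: bound how many sessions (and thus concurrent transactions)
# this one process can hold. With, say, 30 processes at 50 sessions
# each, the fleet stays at the default 1,500-session limit.
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="test-ledger",
                    max_concurrent_transactions=50)
```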

With regards to the other part of your question (documentation, examples, and learning materials), I can provide some information. QLDB was released last month, so re:Invent 2019 is our first opportunity to engage with customers and gain direct feedback on where developers need more help. We gave a 300-level talk at re:Invent 2018 and will do another one this year; that session will be recorded and uploaded to YouTube. I will also be giving a "Chalk Talk" on our Journal-first architecture, covering some of these concepts, but Chalk Talks require you to be there in person. Either way, this is just one of many opportunities we have to engage and better explain the QLDB architecture, its benefits, and its limitations. Feel free to keep asking questions and we'll do our best to answer them and improve the quality of the documentation available.

In terms of the "2-3X" claim: this number was determined by building real-world use cases (such as the banking example) against both blockchain frameworks and QLDB, and distilling those learnings into a single number. We believe the centralized nature of QLDB can provide many benefits if one doesn't need a distributed ledger, and performance is one of them. If you have specific use cases where QLDB is not faster than the same use case on a blockchain framework, we'd love to hear about them.



Source: https://stackoverflow.com/questions/58254582/amazon-qldb-have-any-scaling-performance-limits
