Getting a very bad performance with galera as compared to a standalone mariaDB server

三世轮回 提交于 2021-01-21 05:47:16

问题


I am getting an unacceptable low performance with the galera setup i created. In my setup there are 2 nodes in active-active and i am doing read/writes on both the nodes in a round robin fashion using HA-proxy load balancer.

I was easily able to get over 10000 TPS on my application with the single mariadb server with the below configuration: 36 vpcu, 60 GB RAM, SSD, 10Gig dedicated pipe

With galera i am hardly getting 3500 TPS although i am using 2 nodes(36vcpu, 60 GB RAM) of DB load balanced by ha-proxy. For information, ha-proxy is hosted as a standalone node on a different server. I have removed ha-proxy as of now but there is no improvement in performance.

Can someone please suggest some tuning parameters in my.cnf i should consider to tune this severely under-performing setup.

I am using the below my.cnf file:


回答1:


I was easily able to get over 10000 TPS on my application with the single mariadb server with the below configuration: 36 vpcu, 60 GB RAM, SSD, 10Gig dedicated pipe

With galera i am hardly getting 3500 TPS although i am using 2 nodes(36vcpu, 60 GB RAM) of DB load balanced by ha-proxy.

Clusters based on Galera are not designed to scale writes as I see you intend to do; In fact, as Rick mentioned above: sending writes to multiple nodes for the same tables will end up causing certification conflicts that will reflect as deadlocks for your application, adding huge overhead.

I am getting an unacceptable low performance with the galera setup i created. In my setup there are 2 nodes in active-active and i am doing read/writes on both the nodes in a round robin fashion using HA-proxy load balancer.

Please send all writes to a single node and see if that improves performane; There will always be some overhead due to the nature of virtually-synchronous replication that Galera uses, which literally adds network overhead to each write you perform (albeit true clock-based parallel replication will offset this impact quite a bit, still you are bound to see slightly lower throughput volumes).

Also make sure to keep your transactions short and COMMIT as soon as you are done with an atomic unit of work, since replication-certification process is single-threaded and will stall writes on the other nodes (if you see that your writer node shows transactions wsrep pre-commit stage that means the other nodes are doing certification for a large transaction or that the node is suffering performance problems of some sort -swap, full disk, abusively large reads, etc.

Hope that helps, and let us know how it goes when you move to single node.




回答2:


Turn off the QC:

query_cache_size = 0  -- not 22 bytes
query_cache_type = OFF -- QC is incompatible with Galera

Increase innodb_io_capacity

How far apart (ping time) are the two nodes?

Suggest you pretend that it is Master-Slave. That is, have HAProxy send all traffic to one node, leaving the other as a hot backup. Certain things can run faster in this mode; I don't know about your app.



来源:https://stackoverflow.com/questions/41610698/getting-a-very-bad-performance-with-galera-as-compared-to-a-standalone-mariadb-s

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!