Slow merge replication over a WAN link - only downloads

北战南征 提交于 2020-01-26 04:09:25

问题


We've been using SQL Server merge replication for a few years to synchronise data between our data centres, but we are now suffering with a big performance issue. This may be because the amount of data we are synchronising has increased a lot this year.

Our publisher is an always-on data centre in the UK. Our subscriber is a mobile data centre that travels around the world and is on for periods of up to a week at a time, approx. 25 times a year. However, it also spends the same amount of time (if not more) switched off whilst on its travels - it is a well travelled data centre!

We have 5 database that we synchronise on these servers. However, one of our databases has high numbers of data changes between periods of subscriber downtime and our issue is that it take days to catch up when the server is powered up - the other databases are fine.

Downloads from publisher to subscriber run at about 1.5 rows a second (which is annoying when we have hundreds of thousands of rows) but strangely uploads from subscriber to publisher run about ten times faster.

Things I have checked / tried: • all tables have non-clustered primary keys on guid columns that have the rowguid property set • changing the generation levelling threshold doesn't help • setting the agent profile to high volume doesn't help • running a trace at the publisher and subscriber shows the queries are all running very fast (less than 20 m/s generally, but there are gaps of 200 m/s or so between some batches of queries) • analysis on our WAN link shows we have huge amounts of bandwidth spare • analysis on our servers show we have huge amounts of Ram and CPU spare

Some of the places the subscriber is at do suffer from high latency but this doesn't seem to have an impact - 300 m/s or 100m/s and we still get the same poor performance.

One things I did wonder about - does the replication confirm to the publisher every time it has successfully processed a row at the subscriber? If we have thousands of rows and there is a latency on the line will this compound the issue if it confirms each item? If this does happen, is there a way to batch up messages between publisher and subscriber?

Any help that you can offer will be gladly received!

Thanks

Mark


回答1:


We got to the bottom of this in the end.

It was our use of nvarchar(max) columns that was stopping the replication from using batches. What used to take 3 hours now takes 50 seconds just by changing the data type.

Here is the lesson learnt: "nvarchar(max) is a replication killer"

Thanks



来源:https://stackoverflow.com/questions/28965983/slow-merge-replication-over-a-wan-link-only-downloads

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!