How To Improve Delete Timeout Issues In CRM 2011 On Prem Dev Environment?

Submitted by 元气小坏坏 on 2019-12-10 20:22:15

Question


Background

I have a unit test framework that creates entities for my unit tests, performs the test, then automagically deletes the entities. It had been working fine, except that some entities take 15-30 seconds to delete in our dev environment.

I recently received a VM set up in the Amazon cloud to work on some long-term changes that will take a couple of release cycles to complete. When I run a unit test on the VM, I continually get SQL timeout errors when attempting to delete the entity.

Steps

I've gone down this set of discovery / action steps:

  1. Turned on tracing and saw that the timeout was occurring in fn_CollectForCascadeWrapper, which is used to handle cascading deletes. My unit test only has 6 entities in it, and they are deleted in an order such that no cascading deletes are needed. Ran the Estimated Execution Plan on it and added some of the indexes it suggested. This still didn't fix the timeout issue.
  2. Opened Resource Monitor on the VM to look at disk access, memory, and CPU. When I attempt a delete, the CPU hits 20% for 2 seconds, then drops to near 0. Memory is unchanged, but disk read activity goes crazy high and stays that way for 7-10 minutes.
  3. Hard-coded fn_CollectForCascadeWrapper to return a result meaning nothing needs to be cascaded for the 6 entities in my unit test. Ran the unit test and again got the SQL timeout error. According to the trace, the actual delete statement was timing out:
     delete from [New_inquiryExtensionBase] where ([New_inquiryId] = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd');
     delete from [New_inquiryBase]
     OUTPUT DELETED.[New_inquiryId], 10012
     into SubscriptionTrackingDeletedObject (ObjectId, ObjectTypeCode)
     where ([New_inquiryId] = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd')
  4. Ran the query manually in SQL Server Management Studio. It took around 3 minutes to complete. There are no triggers on the tables, so I thought the time must be due to the insert. Looked at the SubscriptionTrackingDeletedObject table and noticed it had 2100 records in it. Deleted all records in the table and reran my unit test. It actually worked in the normal 15-30 second time frame for deletes.
  5. Researched what the SubscriptionTrackingDeletedObject table is used for, and learned that the Async Service cleans it up. Noticed that the Async Service was not running on the server. Turned the service on, waited 10 minutes, and queried the table again. My 6 entities were still listed there. Looked in the trace log and saw timeout errors: Error cleaning up Principal Object Access Table
  6. Researched the POA (PrincipalObjectAccess) table and performed a SELECT COUNT(*) on it; 7 minutes later it returned 261 million records! Researched how to clean up the table, and the only thing I found was for Update Rollup 6 (we're currently on 11).
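
Since SELECT COUNT(*) took 7 minutes on the POA table, an approximate row count read from partition metadata may be a cheaper way to keep an eye on both tables. This is only a sketch: it assumes the query is run in the CRM organization database, that the tables are named PrincipalObjectAccess and SubscriptionTrackingDeletedObject in the default schema, and that the login has VIEW DATABASE STATE permission.

```sql
-- Approximate row counts taken from metadata; the tables themselves are not scanned.
-- Assumption: both table names resolve in the current (organization) database.
SELECT  OBJECT_NAME(ps.object_id) AS table_name,
        SUM(ps.row_count)         AS approx_rows
FROM    sys.dm_db_partition_stats AS ps
WHERE   ps.object_id IN (OBJECT_ID(N'PrincipalObjectAccess'),
                         OBJECT_ID(N'SubscriptionTrackingDeletedObject'))
  AND   ps.index_id IN (0, 1)   -- heap or clustered index only, so rows are counted once
GROUP BY ps.object_id;
```

The counts can lag slightly behind in-flight transactions, which should not matter when the question is whether a table holds thousands or hundreds of millions of rows.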

What Next?

Could the POA table be affecting the delete? Or is it just that the POA table is affecting the Async Service, which in turn is affecting the delete? Could inserting into SubscriptionTrackingDeletedObject really be causing my problem?


Answer 1:


I ended up turning on SQL Server Profiling, and running the delete statement listed in my question. It took 3.5 minutes to execute. I was expecting it to be kicking something else off that hit the POA table, but nope, it was just deleting those records.
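
For anyone wanting to reproduce this kind of measurement without Profiler, a rough alternative is to run the statements from the question in Management Studio with I/O and timing statistics turned on, inside a transaction that is rolled back. This is a sketch for a dev database only (it holds locks for as long as the delete runs); the id is the one from the question and would need replacing with a real record.

```sql
SET STATISTICS IO ON;    -- per-table logical, physical and read-ahead reads
SET STATISTICS TIME ON;  -- CPU and elapsed time per statement

BEGIN TRANSACTION;

delete from [New_inquiryExtensionBase]
where ([New_inquiryId] = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd');

delete from [New_inquiryBase]
OUTPUT DELETED.[New_inquiryId], 10012
into SubscriptionTrackingDeletedObject (ObjectId, ObjectTypeCode)
where ([New_inquiryId] = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd');

ROLLBACK TRANSACTION;    -- undo the delete so the test record survives

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The Messages tab then lists reads per table, so high physical read counts on tables other than the two being deleted from would point at the referencing tables rather than at the insert into SubscriptionTrackingDeletedObject.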

I took a second look at the query execution plan and noticed there were lots of nested loops that were looking at the child tables that contain a reference to it (13 tiny branches in the plan's tree structure). So all the reads were being performed on the indexes themselves, and taking forever to get loaded on my uber slow VM.

I ended up running the same query for a different id, and it ran in 2 seconds. I then attempted my unit test, and finally it completed successfully.

I'm guessing that each time I attempted a delete, a transaction was started, and then the CRM timeout rolled back the transaction, never allowing the child entity indexes to finish loading. So my current fix is to ensure the child indexes are loaded in memory before actually performing the delete. How I'm going to do that, I'm not sure (perform a query by id for each of the child entities?).
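
One way the "query by id for each of the child entities" idea could be sketched is to generate a lookup per referencing column from the foreign key metadata. This is an untested assumption, not something I ran: it presumes the branches in the plan correspond to SQL foreign keys that reference New_inquiryBase, and that a lookup by id touches the same index pages the delete's reference checks need.

```sql
-- Hypothetical warm-up: one lookup per column that references New_inquiryBase.
-- Assumption: the child relationships exist as real foreign key constraints.
DECLARE @id  uniqueidentifier = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd';
DECLARE @sql nvarchar(max)    = N'';

-- Build one SELECT COUNT(*) statement per referencing column.
SELECT @sql = @sql
     + N'SELECT COUNT(*) FROM '
     + QUOTENAME(OBJECT_SCHEMA_NAME(fkc.parent_object_id)) + N'.'
     + QUOTENAME(OBJECT_NAME(fkc.parent_object_id))
     + N' WHERE '
     + QUOTENAME(COL_NAME(fkc.parent_object_id, fkc.parent_column_id))
     + N' = @id;' + NCHAR(13) + NCHAR(10)
FROM sys.foreign_key_columns AS fkc
WHERE fkc.referenced_object_id = OBJECT_ID(N'New_inquiryBase');

IF @sql <> N''
    EXEC sys.sp_executesql @sql, N'@id uniqueidentifier', @id = @id;
ELSE
    PRINT 'No foreign keys reference New_inquiryBase; the plan branches come from elsewhere.';
```

If a referencing column has no index, its lookup becomes a full scan, which would itself be slow on a cold VM, so it may be worth running the generated statements one at a time first.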

Edit

We had a performance analyst from Microsoft come out, and they wrote up a report that was over 200 pages long. 98% of it said the POA table was too large. Over Christmas we ended up turning off CRM and running some scripts to clean up the POA table. This has been extremely helpful.



Source: https://stackoverflow.com/questions/19162534/how-to-improve-delete-timeout-issues-in-crm-2011-on-prem-dev-environment
