0x80010100: System call failed" exception, ContextSwitchDeadlock

亡梦爱人 提交于 2019-12-12 03:15:02

问题


Long story short: in a C# application that works with COM inproc-server (dll), I encounter "0x80010100: System call failed" exception, and in debug mode also ContextSwitchDeadlock exception.

Now more in details:

1) C# app initializes STA, creates a COM object (registered as "Apartment"); then in subscribes to its connection-point, and begins working with the object.

2) At some stage the COM object generates a lot of events, passing as an argument a very big collection of COM objects, which are created in the same apartment.

3) The event-handler on C# side processes the above collection, occasionally calling some methods of the objects. At some stage the latter calls begin to fail with the above exceptions.

On the COM side the apartment uses a hidden window whose winproc looks like this:

typedef std::function<void(void)> Functor;
LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)   
{   
  switch(msg)   
  {   
    case AM_FUNCTOR:
    {
      Functor *f = reinterpret_cast<Functor *>(lParam);
      (*f)();
      delete f;
    }
    break;   
    case WM_CLOSE:   
      DestroyWindow(hwnd);   
    break;   
    default:   
      return DefWindowProc(hwnd, msg, wParam, lParam);   
  }   
  return 0;   
} 

The events are posted to this window from other parts of the COM server:

void post(const Functor &func)
{
  Functor *f = new Functor(func);
  PostMessage(hWind_, AM_FUNCTOR, 0, reinterpret_cast<LPARAM>(f));
}

The events are standard ATL CP implementations bound with the actual params, and they boil down to something like this:

pConnection->Invoke(id, IID_NULL, LOCALE_USER_DEFAULT, DISPATCH_METHOD, &params, &varResult, NULL, NULL);

In C# the handler looks like this:

private void onEvent(IMyCollection objs)
{
  int len = objs.Count; // usually 10000 - 25000
  foreach (IMyObj obj in objs)
  {
    // some of the following calls fail with 0x80010100
    int id = obj.id;
    string name = obj.name;
    // etc...
  }
}

==================

So, can the above problem happen just because the message-queue of the apartment is too loaded with the events it tries to deliver? Or the message loop should be totally blocked to cause such a behaviour?

Lets assume that the message-queue has 2 sequential events that evaluate to "onEvent" call. The first one enters C# managed code, which attempts to re-enter the unmanaged code, the same apartment. Usually, this is allowed, and we do this a lot. When, under what circumstances can it fail?

Thanks.


回答1:


This ought to work even with multiple apartments provided that:

  • Only one of the threads responds to external events such as network traffic, timers, posted messages etc.
  • Other threads only service COM requests (even if they call back to the main thread during the processing).

AND

  • neither thread queue ever gets full, preventing COM from communicating with the thread.

Firstly: It looks like some objects are not in the same apartment as other objects. Are you sure that all objects are being created in the STA?

What you are describing is a classic deadlock - two independent threads, each waiting on the other. That is what I would expect to occur with that design operating with the C# and COM sides on different threads.

You should be OK if all the objects are on the same thread, as well as the hidden window being on that thread, so I think you need to check that. (Obviously this includes any other objects which are created by the COM side and passed over to the C# side.)

You could try debugging this by pressing "pause" in the debugger and checking what code was in each thread (if you see RPCRT*.DLL this means you are looking at a proxy). Alternately you could DebugPrint the current thread ID from various critical points in both C# and COM sides and your WndProc - they should all be the same.

Secondly: it ought to work with multiple threads provided that only one of the threads generates work items, and the other does nothing but host COM objects which respond to calls (i.e. doesn't generate calls from timers, network traffic, posted messages etc), in this case it may be that the thread queue is full and COM cannot reply to a call.

Instead of using the thread queue, you should use a deque protected by a critical section.

  • http://msdn.microsoft.com/en-us/library/windows/desktop/ms644944(v=vs.85).aspx

    There is a limit of 10,000 posted messages per message queue. This limit should be sufficiently large. If your application exceeds the limit, it should be redesigned to avoid consuming so many system resources.

You might maintain a counter of items on/off the queue to see if this is the issue.

来源:https://stackoverflow.com/questions/9376773/0x80010100-system-call-failed-exception-contextswitchdeadlock

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!