Ignite service hangs when calling cache remove inside another cache's invoke processor: "Possible starvation in striped pool"?

Submitted by 隐身守侯 on 2019-12-11 15:55:50

Question


The Ignite logs show starvation warnings, and the node stops serving requests:

[12:55:22,080][WARNING][grid-timeout-worker-#71][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-25-#26
    Deadlock: false
    Completed: 16272032
Thread [name="sys-stripe-25-#26", id=51, state=WAITING, blockCnt=79, waitCnt=15616666]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.remove0(GridDhtAtomicCache.java:716)
    at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3084)
    at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3065)
    at o.a.i.i.processors.cache.IgniteCacheProxyImpl.remove(IgniteCacheProxyImpl.java:1131)
    at o.a.i.i.processors.cache.GatewayProtectedCacheProxy.remove(GatewayProtectedCacheProxy.java:998)
    at com.test.info.TestInfoBasicExecutor.handleCurrentLevel(TestInfoBasicExecutor.java:281)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:514)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:453)
    at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.runEntryProcessor(GridCacheMapEntry.java:5142)
    at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4550)
    at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4367)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3051)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree$Invoke.access$6200(BPlusTree.java:2945)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1717)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1600)
    at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1199)
    at o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1357)
    at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:345)
    at o.a.i.i.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:1767)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2420)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1883)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1736)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1628)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3055)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:130)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:266)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:261)
    at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
    at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
    at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
    at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
    at o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
    at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
    at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
    at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
    at o.a.i.i.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
    at o.a.i.i.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
    at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
    at java.lang.Thread.run(Thread.java:745)

I use invoke to update cache A. Inside cache A's entry processor (which, as I understand it, already runs while holding the entry's lock), I check the value of the cache A entry and, based on that value, update entries in another cache B, i.e. put or remove. In my tests put works, but remove seems to make the service hang:

    at com.test.info.TestInfoBasicExecutor.handleCurrentLevel(TestInfoBasicExecutor.java:281)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:514)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:453)

======================================================

Update 0702:

To prevent the starvation, I changed my code:

In Ignite Service A's execute function:

cacheA.invoke(record){ // do process to record

igniteQueue.put(processed_record);

}

In Ignite Service B's execute function:

saved_processed_record = igniteQueue.take();

=================

I tried this approach to prevent the starvation. It runs smoothly at the low TPS where the old code hit starvation, but when I run at high TPS the "Possible starvation in striped pool" warning comes back.

It seems that using igniteQueue inside cache.invoke is no more correct than using another cache inside cache.invoke, as I did before.

What I want is to process each record in the cache and then, based on the processed record, update other caches. Is that not possible?


Answer 1:


You should avoid doing cache operations within the entry processor, even if those operations belong to other caches. The reason for that is that all these operations will use the same thread pool - this can cause starvation.
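To illustrate why sharing one pool can starve, here is a minimal sketch using only the JDK. It is an analogy, not Ignite code: a single-thread executor stands in for one stripe, and the class and method names (StarvationSketch, nestedBlockingCompletes, decideThenAct) are hypothetical. The first method mimics an entry processor that blocks on a second operation queued on the same pool. The second method mimics one possible restructuring, where the processor only returns a decision and the caller performs the follow-up operation afterwards.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class StarvationSketch {

    // Anti-pattern: a task running on the pool submits a second task to the
    // same pool and blocks on its result. With one worker thread, the inner
    // task can never start, so the outer task never finishes.
    static boolean nestedBlockingCompletes(ExecutorService pool, long timeoutMs) throws Exception {
        Future<String> outer = pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "removed");
            return inner.get(); // parks the only worker thread
        });
        try {
            outer.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the pool is stuck
        }
    }

    // Restructured: the first task only computes a decision. The caller's
    // thread then submits the follow-up, so no pool thread waits on the pool.
    static String decideThenAct(ExecutorService pool) throws Exception {
        boolean shouldRemove = pool.submit(() -> true).get();
        return shouldRemove ? pool.submit(() -> "removed").get() : "kept";
    }

    public static void main(String[] args) throws Exception {
        ExecutorService stuck = Executors.newSingleThreadExecutor();
        System.out.println("nested blocking completed: " + nestedBlockingCompletes(stuck, 500));
        stuck.shutdownNow(); // interrupts the parked worker so the JVM can exit

        ExecutorService ok = Executors.newSingleThreadExecutor();
        System.out.println("decide-then-act result: " + decideThenAct(ok));
        ok.shutdown();
    }
}
```

In Ignite terms, the equivalent restructuring would be to have the entry processor return a value (cache.invoke hands the processor's return value back to the caller) and to call put or remove on the other cache from the calling thread once invoke has returned. Whether that fits depends on how much atomicity is needed between the two caches.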




Answer 2:


The striped pool is used for processing Ignite messages. It looks like, for some reason, all the threads in this pool are waiting for some operation (a cache remove in your log). This could be related to network problems, or the remove may be taking a long time (for example, if you are removing all the data).

Could you please attach the thread dump and your test code for investigation?



Source: https://stackoverflow.com/questions/50818322/ignite-service-hangs-when-call-cache-remove-in-another-caches-invoke-processor
