F# PSeq.iter does not seem to be using all cores

Posted by 。_饼干妹妹 on 2019-12-05 00:45:17

Based on your updated information, I'm shortening my answer to just the relevant part. You just need this instead of what you currently have:

let result = data |> PSeq.map (calculationFunc >> someFuncToExtractResults)

And this will work the same whether you use PSeq.map or Array.Parallel.map.

However, this will not solve your real problem, which can be stated as: at the degree of parallelism needed to reach 100% CPU usage, there is not enough memory to support all the work running at once.

Can you see how this will not be solved? You can either process things sequentially (less CPU efficient, but memory efficient) or you can process things in parallel (more CPU efficient, but runs out of memory).

The options then are:

  1. Change the degree of parallelism to be used by these functions to something that won't blow your memory:

    let result = data 
                 |> PSeq.withDegreeOfParallelism 2 
                 |> PSeq.map (calculationFunc >> someFuncToExtractResults)
    
  2. Change the underlying logic of calculationFunc >> someFuncToExtractResults so that it is a single, more memory-efficient function that streams data through to the results. Without knowing more detail, it's not simple to see how this could be done, but some lazy loading inside the calculation may well be possible.
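
As a rough illustration of option 2, here is a minimal sketch that contrasts a pipeline that materializes its intermediate data with one that streams it. The names intermediateValues, calculateThenExtract and calculateStreaming are hypothetical stand-ins, since the question's real calculationFunc and someFuncToExtractResults are not shown. Array.Parallel.map is used only because it ships with FSharp.Core; PSeq.map from the FSharp.Collections.ParallelSeq package could be substituted.

```fsharp
// Hypothetical stand-in for the expensive calculation: a lazy stream of
// intermediate values (the real calculationFunc is not shown in the question).
let intermediateValues (seed: int) : seq<float> =
    Seq.init 100000 (fun i -> float ((seed * 31 + i) % 1000))

// Materializing pipeline: builds the whole intermediate array, then extracts.
let calculateThenExtract (seed: int) : float =
    intermediateValues seed |> Seq.toArray |> Array.max

// Streaming pipeline: reduces the lazy sequence directly, so the full
// intermediate array is never allocated.
let calculateStreaming (seed: int) : float =
    intermediateValues seed |> Seq.max

let data = [| 1 .. 16 |]

// Both versions run in parallel; only the per-item memory footprint differs.
let materializedResults = data |> Array.Parallel.map calculateThenExtract
let streamedResults = data |> Array.Parallel.map calculateStreaming

printfn "Same results: %b" (materializedResults = streamedResults)
```

Whether the real calculation can be restructured this way depends on whether the extraction step needs all of the intermediate data at once or can be expressed as a fold over it.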

Array.Parallel.map uses Parallel.For under the hood, while PSeq is a thin wrapper around PLINQ. The reason they behave differently here is that PSeq.iter does not receive enough work: the seq<Calculation> source is evaluated sequentially and yields new items too slowly to keep all cores busy.

I do not see the point of using an intermediate seq or array. Assuming data is the input array, doing all the calculation in one place is the way to go:

// Should use PSeq.map to match with Array.Parallel.map
PSeq.map (calculationFunc >> someFuncToExtractResults) data

and

Array.Parallel.map (calculationFunc >> someFuncToExtractResults) data

You avoid consuming too much memory and have intensive computation in one place which leads to better efficiency in parallel execution.

I had a problem similar to yours and solved it by adding the following to the solution's App.config file:

<runtime> 
    <gcServer enabled="true" />
    <gcConcurrent enabled="true"/>
</runtime>
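
To check that the setting was actually picked up, a small sketch like the following can be run inside the application. It assumes a .NET Framework project in which the <runtime> element above sits inside the App.config's <configuration> root; on .NET Core and later the same switch is normally set through the project file or runtimeconfig.json instead. GCSettings is part of the base class library.

```fsharp
open System
open System.Runtime

// True only if the runtime is really running with the server garbage collector.
let isServerGc = GCSettings.IsServerGC

// Current latency mode; useful context when comparing runs.
let latencyMode = GCSettings.LatencyMode

printfn "Server GC: %b, latency mode: %A, logical processors: %d"
    isServerGc latencyMode Environment.ProcessorCount
```

If this reports that server GC is off after the config change, the file is probably not being read by the process that runs the calculation.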

A calculation that had been taking 5 min 49 s at roughly 22% CPU utilization (as shown in Process Lasso) took 1 min 36 s at roughly 80% CPU utilization after the change.

Another factor that may influence the speed of parallelized code is whether hyperthreading (Intel) or SMT (AMD) is enabled in the BIOS. I have seen cases where disabling it leads to faster execution.
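
Before changing BIOS settings, one cheap experiment is to cap the number of workers at an estimate of the physical core count and compare timings. The sketch below is only a rough proxy, not equivalent to disabling hyperthreading/SMT, because the OS can still place two workers on the same core. It assumes two hardware threads per physical core, and parallelMapWithLimit is a hypothetical helper written for this illustration.

```fsharp
open System
open System.Threading.Tasks

// Assumption: two hardware threads per physical core. Environment.ProcessorCount
// reports logical processors, so halve it to estimate physical cores.
let assumedPhysicalCores = max 1 (Environment.ProcessorCount / 2)

// Hypothetical helper: a parallel map over an array with a hard cap on
// the number of concurrent workers.
let parallelMapWithLimit (limit: int) (f: 'a -> 'b) (input: 'a[]) : 'b[] =
    let output = Array.zeroCreate<'b> input.Length
    let options = ParallelOptions(MaxDegreeOfParallelism = limit)
    Parallel.For(0, input.Length, options, Action<int>(fun i -> output.[i] <- f input.[i]))
    |> ignore
    output

let squares = parallelMapWithLimit assumedPhysicalCores (fun x -> x * x) [| 1 .. 10 |]
printfn "%A" squares
```

Timing the real workload with the limit set to assumedPhysicalCores and again with Environment.ProcessorCount gives a first hint of whether the extra logical cores help or hurt.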
