Using Array.Parallel.map to decrease running time

Posted by 孤者浪人 on 2019-12-22 06:55:57

Question


Hello everyone

I have converted a C# project that paints the Mandelbrot set to F#.
Unfortunately, it takes around one minute to render a full screen, so I am trying to find ways to speed it up.

One call takes almost all of the time:

Array.map (fun x -> this.colorArray.[CalcZ x]) xyArray

xyArray has type (double * double) [] (an array of tuples of doubles).
colorArray is an array of int32 with length 255.

CalcZ is defined as:

open System

let CalcZ (coord: double * double) =

    let maxIterations = 255

    // Iterate z -> z^2 + c with c = (xCoord, yCoord) and z = (x, y) until
    // either component of z exceeds 2 in magnitude or the iteration cap is hit.
    let rec CalcZHelper (xCoord: double) (yCoord: double)
                        (x: double) (y: double) iters =
        let newx = x * x + xCoord - y * y
        let newy = 2.0 * x * y + yCoord
        match newx, newy, iters with
        | _ when Math.Abs newx > 2.0 -> iters
        | _ when Math.Abs newy > 2.0 -> iters
        | _ when iters = maxIterations -> iters
        | _ -> CalcZHelper xCoord yCoord newx newy (iters + 1)

    CalcZHelper (fst coord) (snd coord) (fst coord) (snd coord) 0

Since I only use around half of the processor capacity, one idea is to use more threads, specifically Array.Parallel.map, which translates to System.Threading.Tasks.Parallel.

Now my question

A naive solution would be:

Array.Parallel.map (fun x -> this.colorArray.[CalcZ x]) xyArray  

But that took twice the time. How can I rewrite this to take less time, or is there some other way to make better use of the processor?

Thanks in advance
Gorgen

---edit---
The function that calls CalcZ looks like this:

        let GetMatrix =
            let halfX = double bitmap.PixelWidth * scale / 2.0
            let halfY = double bitmap.PixelHeight * scale / 2.0
            let rect:Mandelbrot.Rectangle = 
                {xMax = centerX + halfX; xMin = centerX - halfX;
                 yMax = centerY + halfY; yMin = centerY - halfY;}

            let size:Mandelbrot.Size = 
                {x = bitmap.PixelWidth; y = bitmap.PixelHeight}

            let xyList = GenerateXYTuple rect size
            let xyArray = Array.ofList xyList
            Array.map (fun x -> this.colorArray.[CalcZ x]) xyArray

        let region:Int32Rect = new Int32Rect(0,0,bitmap.PixelWidth,bitmap.PixelHeight)
        bitmap.WritePixels(region, GetMatrix, bitmap.PixelWidth * 4, region.X, region.Y);

GenerateXYTuple:

let GenerateXYTuple (rect:Rectangle) (pixels:Size) =
    let xStep = (rect.xMax - rect.xMin)/double pixels.x
    let yStep = (rect.yMax - rect.yMin)/double pixels.y
    [for column in 0..pixels.y - 1 do
       for row in 0..pixels.x - 1 do
         yield (rect.xMin + xStep * double row,
           rect.yMax - yStep * double column)]

---edit---

Following a suggestion from kvb (thanks a lot!) in a comment on my question, I built the program in Release mode. Building in Release mode generally sped things up.

Just building in Release took me from 50 seconds to around 30 seconds. Moving all the transforms on the array into a single pass made it around 10 seconds faster. Finally, using Array.Parallel.init brought me to just over 11 seconds.
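
A minimal sketch of what such a single pass with Array.Parallel.init might look like. The names calcZ and render, the single step parameter (standing in for scale), and the palette are hypothetical stand-ins, not the code from the project. The palette is assumed to have maxIterations + 1 entries so that an iteration count of 255 is still a valid index.

```fsharp
// Assumption: same iteration cap as CalcZ in the question.
let maxIterations = 255

// Escape-time count for the point c = (cx, cy), following the logic of CalcZ.
let calcZ (cx: float) (cy: float) =
    let rec loop x y iters =
        let newx = x * x - y * y + cx
        let newy = 2.0 * x * y + cy
        if abs newx > 2.0 || abs newy > 2.0 || iters = maxIterations then iters
        else loop newx newy (iters + 1)
    loop cx cy 0

// One parallel pass: pixel index -> coordinate -> iteration count -> colour.
// No intermediate list or array of coordinates is built.
let render (width: int) (height: int) (xMin: float) (yMax: float)
           (step: float) (colors: int[]) =
    Array.Parallel.init (width * height) (fun i ->
        let col = i % width
        let row = i / width
        let cx = xMin + step * float col
        let cy = yMax - step * float row
        colors.[calcZ cx cy])
```

For example, render 4 3 -2.0 1.0 1.0 (Array.init 256 id) yields a flat array of 12 values in row-major order, which is the layout the WritePixels call in the question expects.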

What I learnt from this is: use Release mode when timing things and when using parallel constructs. Once more, thanks for the help I have received.

---edit---
By using SSE assembler from a native DLL, I have been able to cut the time from around 12 seconds to 1.2 seconds for a full screen of the most computationally intensive points. Unfortunately I don't have a graphics processor...

Gorgen


Answer 1:


As an aside, it looks like you're generating an array of coordinates and then mapping it to an array of results. You don't need to create the coordinate array if you use the init function instead of map:

Array.Parallel.init 1000 (fun y -> Array.init 1000 (fun x -> this.colorArray.[CalcZ (x, y)]))

EDIT: The following may be inaccurate: Your problem could be that you call a tiny function a million times, causing the scheduling overhead to overwhelm the actual work you're doing. You should partition the array into much larger chunks so that each individual task takes a millisecond or so. You can use an array of arrays, calling Array.Parallel.map on the outer array and Array.map on each inner array. That way each parallel operation works on a whole row of pixels instead of just a single pixel.
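
The row-wise partitioning described above could be sketched roughly as follows. mapByRows is a hypothetical helper name, and Array.chunkBySize assumes FSharp.Core 4.0 or later; on older versions the rows would have to be built by hand.

```fsharp
// Hypothetical helper: apply f to every element of a flat pixel array,
// parallelising across rows of `width` elements rather than single pixels.
let mapByRows (width: int) (f: 'T -> 'U) (flat: 'T[]) : 'U[] =
    flat
    |> Array.chunkBySize width           // one inner array per row of pixels
    |> Array.Parallel.map (Array.map f)  // parallel across rows, sequential within a row
    |> Array.concat                      // back to a single flat array
```

With the names from the question, the call would be something like mapByRows bitmap.PixelWidth (fun x -> this.colorArray.[CalcZ x]) xyArray. Whether this beats a plain Array.Parallel.map is worth measuring rather than assuming.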




Answer 2:


Per the comment on the original post, here is the code I wrote to test the function. The fast version only takes a few seconds on my average workstation. It is fully sequential, and has no parallel code.

It's moderately long, so I posted it on another site: http://pastebin.com/Rjj8EzCA

I suspect that the slowdown you are seeing is in the rendering code.




Answer 3:


I don't think that the Array.Parallel.map function (which uses Parallel.For from .NET 4.0 under the covers) should have trouble parallelizing the operation if it runs a simple function ~1 million times. However, I encountered some weird performance behavior in a similar case where F# didn't optimize the call to the lambda function (in some way).

I'd try taking a copy of the Parallel.map function from the F# sources and adding inline. Try adding the following map function to your code and use it instead of the one from F# libraries:

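open System.Threading.Tasks

// Copy of Array.Parallel.map, marked inline so the call to f can be inlined at each use site
let inline map (f: 'T -> 'U) (array : 'T[]) : 'U[] =
  let inputLength = array.Length
  let result = Array.zeroCreate inputLength
  // Each index is written by exactly one iteration, so no locking is needed
  Parallel.For(0, inputLength, fun i ->
    result.[i] <- f array.[i]) |> ignore
  result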
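
To check whether a change like this helps, one option is to time the sequential and parallel versions side by side in a Release build. The sketch below is only an illustration: time is a hypothetical helper, and f is a stand-in workload rather than the Mandelbrot calculation. The inline map above could be swapped in for Array.Parallel.map to compare the two.

```fsharp
open System.Diagnostics

// Hypothetical helper: run `work` once and return its result with the elapsed milliseconds.
let time (work: unit -> 'a) =
    let sw = Stopwatch.StartNew()
    let result = work ()
    sw.Stop()
    result, sw.ElapsedMilliseconds

// Stand-in input and workload (assumptions, not the question's data).
let input = Array.init 1000000 float
let f (x: float) = sqrt x + 1.0

let seqResult, seqMs = time (fun () -> Array.map f input)
let parResult, parMs = time (fun () -> Array.Parallel.map f input)

printfn "sequential: %d ms, parallel: %d ms" seqMs parMs
```

A single run like this is noisy, so repeating the measurement a few times and discarding the first (JIT warm-up) run gives a fairer picture.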


Source: https://stackoverflow.com/questions/4184644/using-array-parallel-map-for-decreasing-running-time
