Question
I've written a function getSamplesFromFile that takes a file and returns its contents as a Vector of Floats. The function reads the contents of the file into a Data.ByteString using Data.ByteString.hGet, then converts this Data.ByteString to a Vector of Floats using:
import qualified Data.Vector.Unboxed as V
import qualified Data.ByteString as BS
import Data.Word
import System.Environment
import GHC.Int
toVector :: BS.ByteString -> V.Vector Float
toVector bs = vgenerate (fromIntegral (BS.length bs `div` 3)) $ \i ->
    myToFloat [BS.index bs (3*i), BS.index bs (3*i+1), BS.index bs (3*i+2)]
  where
    myToFloat :: [Word8] -> Float
    myToFloat = sum . map fromIntegral
    vgenerate n f = V.generate n (f . fromIntegral)
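To make the conversion concrete, here is a minimal, self-contained sketch. The toVector definition repeats the one above (without the vgenerate helper, since BS.length already returns an Int). The hGet-based getSamplesFromFile is described but not listed at this point, so its body here is only an assumed shape: it reads the whole file strictly in one call. demoSamples is a hypothetical name introduced purely for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Vector.Unboxed as V
import Data.Word (Word8)
import System.IO (IOMode (ReadMode), hFileSize, withBinaryFile)

-- Same conversion as above: every 3 consecutive bytes are summed into one Float.
-- Trailing bytes that do not fill a complete triple are ignored (because of `div` 3).
toVector :: BS.ByteString -> V.Vector Float
toVector bs = V.generate (BS.length bs `div` 3) $ \i ->
    myToFloat [BS.index bs (3*i), BS.index bs (3*i+1), BS.index bs (3*i+2)]
  where
    myToFloat :: [Word8] -> Float
    myToFloat = sum . map fromIntegral

-- ASSUMPTION: one possible shape of the hGet-based reader described in the text.
-- BS.hGet is strict, so the whole file is in memory before toVector runs.
getSamplesFromFile :: FilePath -> IO (V.Vector Float)
getSamplesFromFile path =
  withBinaryFile path ReadMode $ \h -> do
    size <- hFileSize h
    bs <- BS.hGet h (fromIntegral size)
    return (toVector bs)

-- Hypothetical in-memory input: 7 bytes give 2 samples, the 7th byte is dropped.
demoSamples :: V.Vector Float
demoSamples = toVector (BS.pack [1, 2, 3, 4, 5, 6, 7])
```

Evaluating demoSamples in GHCi gives [6.0,15.0], i.e. 1+2+3 and 4+5+6. Note that in this sketch the ByteString itself is always read in full; only the construction of the Vector could be deferred.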
I was testing how lazy this program was via a small test program:
main = do
  [file] <- getArgs
  samples <- getSamplesFromFile file
  let slice = V.slice 0 50000 samples
  print slice
If I run this on a 13 MB file, it seems as if every sample is loaded into memory, even though I only need 50000 samples to be printed. If I make a small modification to this program and first map or filter over the slice, the result is different:
main = do
  [file] <- getArgs
  samples <- getSamplesFromFile file
  let slice = V.slice 0 50000 samples
  let mapped = V.map id slice
  print mapped
This way, it seems that not every sample was loaded into memory, only the slice:
To make sure this was the case, I ran the program again with a slice of half the size (25000 samples):
Now the memory usage seems to be proportional to the size of the slice, just because I map over the slice with id.
The result is the same when filtering over the samples. How can applying a higher-order function suddenly make the behavior lazy?
EDIT
The problem seems to have something to do with cabal. As you can see from the pictures, I was testing my code inside a cabal project called laziness. I can't reproduce this weird behavior if I use a separate Main.hs file outside of a cabal project. This is the Main.hs I'm using:
module Main where
import qualified Data.ByteString as BS
import qualified Data.Vector.Unboxed as V
import Data.Word
import GHC.Int
import System.Environment
main = do
  [file] <- getArgs
  samples <- getSamplesFromFile file
  let slice = V.slice 0 50000 samples
  --let filtered = V.filter (>0) slice
  let mapped = V.map id slice
  print slice
getSamplesFromFile = fmap toVector . BS.readFile
toVector :: BS.ByteString -> V.Vector Float
toVector bs = vgenerate (fromIntegral (BS.length bs `div` 3)) $ \i ->
    myToFloat [BS.index bs (3*i), BS.index bs (3*i+1), BS.index bs (3*i+2)]
  where
    myToFloat :: [Word8] -> Float
    myToFloat = sum . map fromIntegral
    vgenerate n f = V.generate n (f . fromIntegral)
I don't experience the weird behavior if I do the following:
- Create a new directory somewhere via mkdir
- Add the above Main.hs to the directory
- Compile using ghc Main.hs -O2 -rtsopts -prof
- Run via ./Main myfile.wav +RTS -hy
- Create the pdf using hp2ps and ps2pdf
I do experience the weird behavior if I do the following:
- Create a new directory, laziness, via mkdir laziness
- Initiate a cabal project via cabal init
- Add the above Main.hs to /src
- Add ghc-options: -O2 -rtsopts -prof to laziness.cabal
- Compile using cabal install
- Run via laziness myfile.wav +RTS -hy
- Create the pdf using hp2ps and ps2pdf
I even experience the weird behavior if I:
- cd laziness/src
- Compile using ghc Main.hs -O2 -rtsopts -prof
- Run via ./Main myfile.wav +RTS -hy
- Create the pdf using hp2ps and ps2pdf
So it seems that this behavior only occurs when the code is inside a cabal project. This seems weird to me. Could this have something to do with the setup of my cabal project?
Source: https://stackoverflow.com/questions/42330183/evaluation-of-higher-order-functions