F#: removing duplicates from a seq is slow

半阙折子戏 2021-01-12 03:11

I am attempting to write a function that weeds out consecutive duplicates, as determined by a given equality function, from a seq<'a> but with a twist:

9 answers
  •  滥情空心
    2021-01-12 03:56

    The problem is how you use sequences. All those yields, heads, and tails spin a web of iterators branching off other iterators, and when you finally materialize it by calling List.ofSeq, you iterate through your input sequence far more often than you should.

    Each of those Seq.head calls is not simply taking the first element of a sequence. It is taking the first element of the tail of the tail of the tail of a sequence, and so on, and every one of those layers re-enumerates the input from the start.

    Check this out - it'll count the times the element constructor is called:

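    let count = ref 0
    
    // Every call to the element generator bumps the counter.
    Seq.init 1000 (fun _ -> count.Value <- count.Value + 1; 1)
    |> dedupeTakingLast (fun (x,y) -> x = y) None
    |> List.ofSeq
    |> ignore

    printfn "element constructor calls: %d" count.Value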
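
    The snippet above depends on dedupeTakingLast from the question. As a self-contained illustration of the same effect, here is a minimal sketch that walks a sequence with Seq.head and Seq.tail and compares the number of generator calls against a single pass. The helper sumViaTail and the other names are hypothetical and exist only for this demonstration:

    ```fsharp
    // Counts how often the element generator runs.
    let calls = ref 0
    let source = Seq.init 100 (fun i -> calls.Value <- calls.Value + 1; i)

    // Hypothetical helper: sums a seq by recursing on Seq.tail.
    // Each Seq.tail wraps the previous sequence in another iterator layer,
    // and each Seq.isEmpty / Seq.head starts a fresh enumeration through
    // all of those layers.
    let rec sumViaTail (s: seq<int>) (acc: int) : int =
        if Seq.isEmpty s then acc
        else sumViaTail (Seq.tail s) (acc + Seq.head s)

    let total = sumViaTail source 0
    let callsViaTail = calls.Value

    // Same sum in a single pass over one enumerator.
    calls.Value <- 0
    let totalSinglePass = Seq.sum source
    let callsSinglePass = calls.Value

    printfn "head/tail: sum = %d, generator calls = %d" total callsViaTail
    printfn "one pass:  sum = %d, generator calls = %d" totalSinglePass callsSinglePass
    ```

    The head/tail version should report far more generator calls than the sequence has elements, and the gap grows roughly quadratically with the length of the input.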
    

    Incidentally, just switching all the Seqs to Lists makes it finish instantly.
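
    A rough sketch of what a list-based version could look like, not the asker's actual code: it assumes, going by the function's name, that the last element of each run of consecutive duplicates is the one to keep, and it drops the extra None argument from the original call. The name dedupeTakingLastList is made up for this example:

    ```fsharp
    // Hypothetical list-based variant. Keeps the last element of each run of
    // consecutive duplicates, as judged by `eq` on adjacent pairs.
    let dedupeTakingLastList (eq: 'a * 'a -> bool) (input: 'a list) : 'a list =
        let rec loop acc rest =
            match rest with
            | [] -> List.rev acc
            | [ x ] -> List.rev (x :: acc)          // final element always closes its run
            | x :: (y :: _ as tail) ->
                if eq (x, y) then loop acc tail     // x is superseded by y
                else loop (x :: acc) tail           // x is the last of its run
        loop [] input

    // Usage: duplicates are decided by the first tuple component, so the
    // second component shows which element of each run survived.
    let result =
        dedupeTakingLastList
            (fun (x, y) -> fst x = fst y)
            [ (1, "a"); (1, "b"); (2, "c"); (2, "d"); (3, "e") ]

    printfn "%A" result   // [(1, "b"); (2, "d"); (3, "e")]
    ```

    Pattern matching on a list takes its head and tail in constant time, so the whole input is visited once instead of being re-enumerated at every step.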
