F# Removing duplicates from list with function

Submitted by 我的梦境 on 2021-02-08 05:05:31

Question


I want to create a function that takes a list and returns a list with the duplicates removed.

let removedupes list1 =
  let list2 = []
  let rec removeduprec list1 list2 =
    match list1 with
    | [] -> list2
    | head :: tail when mem list2 head = false -> head :: removeduprec tail list2
    | _ -> removeduprec list1.Tail list2
  removeduprec list1 list2

I'm using this "mem" function to go through the list and see whether the value already exists; in that case I want to continue with the recursion.

let rec mem list x = 
  match list with
  | [] -> false
  | head :: tail -> 
    if x = head then true else mem tail x 

When I test this code I get:

let list1 =  [ 1; 2; 3; 4; 5; 2; 2; 2]
removedupes list1;;
val it : int list = [1; 2; 3; 4; 5; 2; 2; 2]

I think the problem is in the "head :: removeduprec tail list2" part, but I'm quite new to F#, so I'm not completely sure how this works.


Answer 1:


I rewrote some of the logic to make things simpler. The problem was that nothing was ever added to list2, so the mem check always ran against an empty list. You need to add each element to list2 as you go rather than afterwards, so I moved the :: inside the recursive call, like so:

let rec mem list x =
  match list with
  | [] -> false
  | head :: tail ->
    if x = head then true else mem tail x

let removedupes list1 =
  let rec removeduprec list1 list2 =
    match list1 with
    | [] -> list2
    | head :: tail when mem list2 head = false -> removeduprec tail (head::list2)
    | h::t -> removeduprec t list2
  removeduprec list1 []
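
One thing to be aware of: because list2 is built by prepending each new element, the function above returns the distinct elements in reverse order of first occurrence ([5; 4; 3; 2; 1] for the test list in the question). A minimal sketch of an order-preserving variant follows. The name removedupesOrdered is hypothetical, and it assumes F# 4.0 or later so that the built-in List.contains can stand in for mem.

```fsharp
// Hypothetical variant of the answer's function; the base case reverses
// the accumulator so elements come back in first-occurrence order.
let removedupesOrdered list1 =
    let rec loop remaining acc =
        match remaining with
        | [] -> List.rev acc
        | head :: tail when not (List.contains head acc) -> loop tail (head :: acc)
        | _ :: tail -> loop tail acc
    loop list1 []

printfn "%A" (removedupesOrdered [1; 2; 3; 4; 5; 2; 2; 2])  // [1; 2; 3; 4; 5]
```

Like the original, this scans the accumulator once per element, so it is quadratic in the worst case; it is meant to illustrate the recursion, not to be fast.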



Answer 2:


A complement to the approaches in stackoverflow.com/questions/6842466 and in John's answer; less idiomatic, but fast and obvious:

let removeDups is =
    let d = System.Collections.Generic.Dictionary()
    [ for i in is do match d.TryGetValue i with
                     | (false,_) -> d.[i] <- (); yield i
                     | _ -> () ]

It removes duplicates from a list of 1000000 elements with 100000 possible distinct values in:

Real: 00:00:00.182, CPU: 00:00:00.171, GC gen0: 14, gen1: 1, gen2: 0

Update: following ildjarn's comment, using HashSet in place of Dictionary roughly doubles performance on the same data:

Real: 00:00:00.093, CPU: 00:00:00.093, GC gen0: 2, gen1: 1, gen2: 0
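
The HashSet version is not shown in the answer. A minimal sketch of what it might look like is below; the name removeDupsHashSet is hypothetical, and this is not necessarily the exact code that was timed. It relies on HashSet.Add returning true only when the element was not already in the set.

```fsharp
open System.Collections.Generic

// Hypothetical HashSet-based variant; keeps the first occurrence of each
// element and preserves the original order.
let removeDupsHashSet items =
    let seen = HashSet<_>()
    [ for x in items do
        // Add returns false if x has been seen before, so duplicates are skipped.
        if seen.Add x then yield x ]

printfn "%A" (removeDupsHashSet [1; 2; 3; 4; 5; 2; 2; 2])  // [1; 2; 3; 4; 5]
```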

By contrast, using a set literally, as suggested, degrades performance 27x on the same test case:

Real: 00:00:02.788, CPU: 00:00:02.765, GC gen0: 100, gen1: 21, gen2: 1



Answer 3:


Just for completeness: as of F# 4.0, the List module has a distinct function that does exactly what the OP wants.

List.distinct [1; 2; 2; 3; 3; 3];;
val it : int list = [1; 2; 3]



Answer 4:


The answer from John is probably what you are looking for: it shows an idiomatic functional way to solve the problem. However, if you do not want to implement the functionality yourself, the easiest way is to turn the list into a set (which cannot contain duplicates) and then back into a list:

let list1 = [ 1; 2; 3; 4; 5; 2; 2; 2]
let list2 = List.ofSeq (set list1)

This is probably the shortest solution :-) One difference from John's version is that this does not preserve the original ordering of the list (it actually sorts it).
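
The ordering difference is easiest to see on an unsorted input. The sketch below compares the set round-trip with List.distinct (assuming F# 4.0 or later); the binding names are only for illustration.

```fsharp
// An F# set enumerates in ascending order, so the round-trip sorts the result.
let viaSet = List.ofSeq (set [3; 1; 2; 3; 1])

// List.distinct keeps the first occurrence of each element in its original position.
let viaDistinct = List.distinct [3; 1; 2; 3; 1]

printfn "%A" viaSet       // [1; 2; 3]
printfn "%A" viaDistinct  // [3; 1; 2]
```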



Source: https://stackoverflow.com/questions/21151535/f-removing-duplicates-from-list-with-function
