Splitting a list into list of lists based on predicate

感动是毒 2021-01-21 03:28

(I am aware of this question, but it relates to sequences, which is not my problem here)

Given this input (for example):

let testlist = 
    [  
       \         


        
5 Answers
  •  灰色年华
    2021-01-21 03:53

    Edit: rev-less version using foldBack added below.

    Here's some code that uses lists and tail-recursion:

    //divides a list L into chunks for which all elements match pred
    let divide pred L =
        let rec aux buf acc L =
            match L,buf with
            //no more input and an empty buffer -> return acc
            | [],[] -> List.rev acc 
            //no more input and a non-empty buffer -> return acc + rest of buffer
            | [],buf -> List.rev (List.rev buf :: acc) 
            //found something that matches pred: put it in the buffer and go to next in list
            | h::t,buf when pred h -> aux (h::buf) acc t
            //found something that doesn't match pred. Continue but don't add an empty buffer to acc
            | h::t,[] -> aux [] acc t
            //found input that doesn't match pred. Add buffer to acc and continue with an empty buffer
            | h::t,buf -> aux [] (List.rev buf :: acc) t
        aux [] [] L
    

    Usage:

    > divide pred testlist;;
    val it : string list list =
      [["*text1"; "*text2"]; ["*text5"; "*text6"; "*text7"]]
    

    Using a list as the buffer means it must be reversed whenever its contents are output. This is unlikely to matter if individual chunks are modestly sized. If speed or efficiency becomes an issue, you could use a Queue<'a> or a List<'a> (the System.Collections.Generic type, called ResizeArray<'a> in F#) for the buffers, since appending to either is fast. But using these data structures instead of lists also means you lose list pattern matching. In my opinion, being able to pattern match on lists outweighs the cost of a few List.rev calls.
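
    As a rough sketch of the mutable-buffer alternative mentioned above, the following uses ResizeArray<'a> (F#'s alias for System.Collections.Generic.List<'a>) for both the current chunk and the list of finished chunks. The name divideWithResizeArray, the sample input, and the StartsWith predicate are assumptions made for illustration; they are not part of the original answer or question.

```fsharp
// Sketch only: divideWithResizeArray is a hypothetical name, not from the original answer.
let divideWithResizeArray (pred: 'a -> bool) (items: 'a list) : 'a list list =
    let chunks = ResizeArray<'a list>()   // finished chunks, in input order
    let buf = ResizeArray<'a>()           // current run of elements matching pred
    // Move a non-empty buffer into chunks and reset it.
    let flush () =
        if buf.Count > 0 then
            chunks.Add(List.ofSeq buf)
            buf.Clear()
    for x in items do
        if pred x then buf.Add(x) else flush ()
    flush ()                              // emit a trailing run, if any
    List.ofSeq chunks

// Hypothetical sample input and predicate (assumed, not the question's testlist).
let sample = ["a"; "*b"; "*c"; "d"; "*e"]
let chunked = divideWithResizeArray (fun (s: string) -> s.StartsWith("*")) sample
// chunked = [["*b"; "*c"]; ["*e"]]
```

    No List.rev is needed here, but the state is mutable and the logic is a loop rather than a pattern match, which is the trade-off described above.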

    Here's a streaming version that outputs the result one block at a time. This avoids the List.rev on the accumulator in the previous example:

    let dividestream pred L =
        let rec aux buf L =
            seq { match L, buf with
                  | [],[] -> ()
                  | [],buf -> yield List.rev buf
                  | h::t,buf when pred h -> yield! aux (h::buf) t
                  | h::t,[] -> yield! aux [] t
                  | h::t,buf -> yield List.rev buf
                                yield! aux [] t }
        aux [] L
    

    The streaming version still reverses each chunk before yielding it. List.foldBack walks the list from right to left, so it can build both the chunks and the accumulator in order, with no List.rev at all.
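
    To see why folding from the right removes the need for List.rev, it may help to compare List.fold and List.foldBack on a small list. This is only an illustration with made-up values (xs, viaFold, and viaFoldBack are names chosen here):

```fsharp
let xs = [1; 2; 3]

// List.fold visits elements left to right, so consing onto the
// accumulator produces the elements in reverse order.
let viaFold = List.fold (fun acc x -> x :: acc) [] xs          // [3; 2; 1]

// List.foldBack visits elements right to left, so consing onto the
// accumulator preserves the original order.
let viaFoldBack = List.foldBack (fun x acc -> x :: acc) xs []  // [1; 2; 3]
```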

    Update: here's a version using foldBack:

    //divides a list L into chunks for which all elements match pred
    let divide2 pred L =
        let f x (acc,buf) =
            match pred x,buf with
            | true,buf -> (acc,x::buf)
            | false,[] -> (acc,[])
            | false,buf -> (buf::acc,[])
    
        let rest,remainingBuffer = List.foldBack f L ([],[])
        match remainingBuffer with
        | [] -> rest
        | buf -> buf :: rest
    
