F# how to Window a sequence based on predicate rather than fixed length

两盒软妹~` submitted on 2019-12-10 15:57:11

Question


Given the following input sequence, I would like to generate the desired output. I know that Seq.windowed can be used to almost get the desired result if all the windows are a fixed length. However, in this case they are not a fixed length: I would like to start a new sequence whenever "a" is encountered. Is this possible with the standard collections library?

let inputSequence = 
      ["a"; "b"; "c";
       "a"; "b"; "c"; "d";
       "a"; "b"; 
       "a"; "d"; "f";
       "a"; "x"; "y"; "z"]

let desiredResult = 
   [["a"; "b"; "c"]
    ["a"; "b"; "c"; "d"]
    ["a"; "b"]
    ["a"; "d"; "f"]
    ["a"; "x"; "y"; "z"]]

Answer 1:


Here's a way that uses mutable state but is pretty concise:

let mutable i = 0
[ for x in inputSequence do
    if x = "a" then i <- i + 1
    yield i, x ]
|> List.groupBy fst
|> List.map snd
|> List.map (List.map snd)
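
The same tagging idea can be packaged as a reusable function that takes a predicate instead of a hard-coded "a". The following is only a sketch: the name splitWhen and its parameters are made up for illustration, and it swaps the `let mutable` for a `ref` cell so the counter can live inside a function without running into any restriction on capturing mutable locals.

```fsharp
// Hypothetical helper (not part of FSharp.Core): tags every element with a
// running chunk number, then groups by that number.
let splitWhen (startsChunk: 'a -> bool) (items: 'a list) : 'a list list =
    // A ref cell instead of `let mutable`, so the counter can be updated
    // from inside the list expression.
    let counter = ref 0
    [ for x in items do
        if startsChunk x then counter.Value <- counter.Value + 1
        yield counter.Value, x ]
    |> List.groupBy fst               // groups keep first-seen order
    |> List.map (snd >> List.map snd) // drop the chunk numbers again

let result = splitWhen (fun x -> x = "a") ["a"; "b"; "c"; "a"; "b"]
// result = [["a"; "b"; "c"]; ["a"; "b"]]
```

Elements that appear before the first match end up in a chunk of their own (tagged 0), which may or may not be what you want.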



Answer 2:


As mentioned in the other answer, you can fairly easily implement this using recursion or using fold. To make the recursive version more useful, you can define a function chunkAt that creates a new chunk when the list contains a specific value:

let chunkAt start list = 
  let rec loop chunk chunks list = 
    match list with
    | [] -> List.rev ((List.rev chunk)::chunks)
    | x::xs when x = start && List.isEmpty chunk -> loop [x] chunks xs
    | x::xs when x = start -> loop [x] ((List.rev chunk)::chunks) xs
    | x::xs -> loop (x::chunk) chunks xs
  loop [] [] list

Then you can run it on your input sequence using:

chunkAt "a" inputSequence
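
Since the question asks about splitting on a predicate, here is one possible generalization of the recursive approach. It is a sketch rather than a drop-in replacement: chunkWhen and isStart are names invented for this example. Unlike chunkAt, which returns [[]] for an empty input, this variant returns an empty list in that case.

```fsharp
// Hypothetical predicate-based variant of the recursive chunking function.
let chunkWhen (isStart: 'a -> bool) (items: 'a list) : 'a list list =
    // `chunk` holds the current chunk in reverse order,
    // `chunks` holds the finished chunks, also in reverse order.
    let rec loop chunk chunks rest =
        match rest with
        | [] ->
            // Emit the last chunk only if it contains something.
            let chunks = if List.isEmpty chunk then chunks else List.rev chunk :: chunks
            List.rev chunks
        | x :: xs when isStart x && not (List.isEmpty chunk) ->
            // A new chunk starts here: close the current one first.
            loop [x] (List.rev chunk :: chunks) xs
        | x :: xs ->
            loop (x :: chunk) chunks xs
    loop [] [] items

let chunks = chunkWhen (fun x -> x = "a") ["a"; "b"; "a"; "c"; "d"]
// chunks = [["a"; "b"]; ["a"; "c"; "d"]]
```

The loop is tail-recursive, so it should cope with long lists; elements before the first match are kept as a leading chunk.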

Although there is no standard library function doing this, you can use the data series manipulation library Deedle, which implements a fairly rich set of chunking functions. To do this using Deedle, you can turn your sequence into a series indexed by ordinal index and then use:

let s = Series.ofValues inputSequence
let chunked = s |> Series.chunkWhile (fun _ k2 -> s.[k2] <> "a")

If you wanted to turn data back to a list, you can use the Values property of the returned series:

chunked.Values |> Seq.map (fun s -> s.Values)



Answer 3:


Unfortunately, despite its FP heritage, F# lacks some common list manipulation functions, splitting/partitioning based on a predicate being one of them. You can probably implement this using recursion or a fold. However, here is a version that only applies library functions:

let inputSequence = 
      ["a"; "b"; "c";
       "a"; "b"; "c"; "d";
       "a"; "b"; 
       "a"; "d"; "f";
       "a"; "x"; "y"; "z"]

let aIdx = 
    inputSequence 
        |> List.mapi (fun i x -> i, x) // pair each element with its index
        |> List.filter (fun x -> snd x = "a") // keep only the "a" elements
        |> List.map fst // extract their indices into a list

[List.length inputSequence] 
    |> List.append aIdx // the "a" indices followed by the end of the list
    |> List.pairwise // (begin, end) index pairs
    |> List.map (fun (x,y) -> inputSequence.[x .. (y - 1)]) 

// val it : string list list =
//   [["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"]; ["a"; "d"; "f"];
//    ["a"; "x"; "y"; "z"]]



Answer 4:


This answer has pretty much the same mechanism as the one provided by @TheQuickBrownFox but it doesn't use a mutable:

inputSequence 
|> List.scan (fun i x -> if x = "a" then i + 1 else i) 0 
|> List.tail
|> List.zip inputSequence 
|> List.groupBy snd
|> List.map (snd >> List.map fst)
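
To see why the pipeline works, it may help to look at the intermediate values on a shorter list. The names sample, groupIds, tagged and chunks below are illustrative only.

```fsharp
let sample = ["a"; "b"; "a"; "c"; "d"]

// Running count of "a" seen so far. List.scan also emits the initial
// state 0, which List.tail drops so the ids line up with the elements.
let groupIds =
    sample
    |> List.scan (fun i x -> if x = "a" then i + 1 else i) 0
    |> List.tail
// groupIds = [1; 1; 2; 2; 2]

// Pair every element with the id of the chunk it belongs to.
let tagged = List.zip sample groupIds
// tagged = [("a", 1); ("b", 1); ("a", 2); ("c", 2); ("d", 2)]

// Group by chunk id, then throw the ids away.
let chunks =
    tagged
    |> List.groupBy snd
    |> List.map (snd >> List.map fst)
// chunks = [["a"; "b"]; ["a"; "c"; "d"]]
```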

In case you want to use a library, in addition to the one suggested by @Tomas, F#+ provides some basic split functions that let you compose your function like this:

let splitEvery x = 
    List.split (seq [[x]]) >> Seq.map (List.cons x) >> Seq.tail >> Seq.toList

There is also a proposal to include these kinds of functions in F# core; the discussion is worth reading.




Answer 5:


Here is a short one:

let folder (str: string) ((xs, xss): list<string> * list<list<string>>) =
    if str = "a" then ([], ((str :: xs) :: xss))
    else (str :: xs, xss)

List.foldBack folder inputSequence ([], [])
|> snd

// [["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"]; ["a"; "d"; "f"]; ["a"; "x"; "y"; "z"]]

This satisfies the specification in the question ("I would like to start a new sequence whenever "a" is encountered"), since any initial strings before the first "a" are ignored. For example, for

let inputSequence = 
      ["r"; "s";
       "a"; "b"; "c";
       "a"; "b"; "c"; "d";
       "a"; "b"; 
       "a"; "d"; "f";
       "a"; "x"; "y"; "z"]

one gets the same result as above.

If one needs to capture the initial strings before the first "a", the following can be used:

match inputSequence |> List.tryFindIndex (fun x -> x = "a") with
| None -> [inputSequence]
| Some i -> (List.take i inputSequence) :: 
            (List.foldBack folder (List.skip i inputSequence) ([], []) |> snd)

// [["r"; "s"]; ["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"];
//  ["a"; "d"; "f"]; ["a"; "x"; "y"; "z"]]
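
The two variants can arguably be merged into one function, because the first component of the fold result is exactly the run of elements that precede the first match. The sketch below assumes that observation; splitBefore and isStart are hypothetical names, and the folder is generalized from the string-specific one above to take a predicate.

```fsharp
// Hypothetical helper: predicate-based split using List.foldBack.
let splitBefore (isStart: 'a -> bool) (items: 'a list) : 'a list list =
    // Walking from the right, `current` collects elements until a start
    // element is reached, at which point the chunk is closed.
    let folder x (current, chunks) =
        if isStart x then ([], (x :: current) :: chunks)
        else (x :: current, chunks)
    // Whatever is left in `current` at the end came before the first start.
    let prefix, chunks = List.foldBack folder items ([], [])
    if List.isEmpty prefix then chunks else prefix :: chunks

let withPrefix = splitBefore (fun x -> x = "a") ["r"; "s"; "a"; "b"; "a"; "c"]
// withPrefix = [["r"; "s"]; ["a"; "b"]; ["a"; "c"]]
```

This keeps the prefix only when it is non-empty, so an input that starts with a match does not produce an empty leading chunk.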


Source: https://stackoverflow.com/questions/46389854/f-how-to-window-a-sequence-based-on-predicate-rather-than-fixed-length
