Map to Deedle Frame

冷眼眸甩不掉的悲伤, submitted on 2019-12-07 22:15:03

Question


I am learning F#. I am trying to convert a Map<string, seq<DateTime * float>> to a Deedle dataframe (http://bluemountaincapital.github.io/Deedle/tutorial.html#creating).

I have prepared the following code:

let folderFnct (aFrame:Frame) colName datesAndValues =
    let newSerie = Series(Seq.map (fun x -> fst x) datesAndValues, Seq.map (fun y -> snd y) datesAndValues)
    let newFrame = aFrame.Join([colName], [newSerie], kind=JoinKind.Inner)
    newFrame


let mapToDeedleFrame myMap frame =       
    Map.fold ( fun s ticker datesAndValues -> folderFnct s ticker datesAndValues) frame myMap

mapToDeedleFrame folds the map using an existing frame. The folder function folderFnct:

  • takes the frame
  • uses the Map key as column name in the frame, and
  • processes the values (seq<DateTime * float>), building a Series from them.

The problem is with:

let newFrame = aFrame.Join([colName], [newSerie], kind=JoinKind.Inner)

which fails with the error:

The field, constructor or member 'Join' is not defined

This leaves me with three questions:

  1. Why is aFrame.Join not defined? I tried explicitly specifying the type of aFrame.
  2. How can I pass an empty frame to mapToDeedleFrame?
  3. Should I pattern match in folderFnct against the case where aFrame is empty?

Thanks a lot!

EDIT 1

Based on Tomas's suggestion, this is what I have come up with so far.

let folderFnct (aFrame:Frame<'a, 'b>) columnName (seqOfTuples: seq<'a*'b>) =
    let newSerie = Series(Seq.map (fun x -> fst x) seqOfTuples, Seq.map (fun y -> snd y) seqOfTuples)
    let otherFrame = Frame([columnName], [newSerie])
    let newFrame = aFrame.Join((otherFrame), kind=JoinKind.Inner)
    newFrame


let mapToDeedleFrame myMap frame =       
    Map.fold ( fun state k vals -> folderFnct state k vals) frame myMap

The last step missing is: how do I quickly pass an empty Frame (maybe avoiding creating a dummy one) to mapToDeedleFrame? I have tried [] as in

let frame = mapToDeedleFrame mapTS []

This may be a silly question, but I am new to F# and I was wondering if there is an Empty type built into the language.

FOLLOW UP QUESTION

In the source file I read (https://github.com/BlueMountainCapital/Deedle/blob/master/src/Deedle/Frame.fs):

  member frame.Join<'V>(colKey, series:Series<'TRowKey, 'V>, kind, lookup) =    
    let otherFrame = Frame([colKey], [series])
    frame.Join(otherFrame, kind, lookup)

while the function description that pops up on screen shows a different signature:

[Screenshot of the tooltip; image not preserved]

From the screenshot I would guess that the type of the Frame is the same as the type of colKey, whereas, as I understand it, colKey is just the key of the column that the join adds to the data frame from the series. As a complete noob, I am quite confused.

EDIT 2

I have rewritten the code:

let seriesListMapper (colName:string, series:Series<'a, 'b>) = 
    [colName => series] |> frame


let frameListReducer (accFrame: Frame<'a, 'b>) (aFrame: Frame<'a, 'b>) =
     accFrame.Join(aFrame, kind=JoinKind.Outer)


let seriesListToFrame (seriesList: List<string * Series<'a, 'b>>) =
    seriesList |> List.map (fun elem -> seriesListMapper elem) |> List.reduce(fun acc elem -> frameListReducer acc elem)

The problem is that:

let frame = seriesListToFrame seriesList

returns frame as Frame, while seriesList is instead (string * Series<DateTime, float>) list.

I think that the problem is with:

let seriesListMapper (colName:string, series:Series<'a, 'b>) = 
    [colName => series] |> frame

In fact seriesListMapper is indicated as

seriesListMapper: colName:string * series:Series<'a, 'b> -> Frame<'a, string>

I do not understand how and why the values are converted to string from float.

One interesting thing is that printing the frame with frame.Format() confirms that the data looks correct. It is just this "strange" conversion to string.


Answer 1:


In the type annotation of the folderFnct, you have aFrame:Frame. However, the type representing data frames is a generic type with two type arguments (representing the type of index for rows and columns, respectively), so the annotation should be aFrame:Frame<_, _>.
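As a minimal sketch of that fix, the folder function could be annotated as below. The concrete key types (DateTime rows, string columns) are an assumption taken from the question's Map<string, seq<DateTime * float>>, and the four-argument Join call mirrors the overload quoted from Frame.fs in the question. Series.ofObservations and Lookup.Exact are used here on the assumption that they behave as in current Deedle releases.

```fsharp
open System
open Deedle

// Frame takes two type arguments: the row key type and the column key type.
// Frame<_, _> would also work and lets the compiler infer both; they are
// spelled out here (an assumption based on the question's data) for clarity.
let folderFnct (aFrame: Frame<DateTime, string>)
               (colName: string)
               (datesAndValues: seq<DateTime * float>) =
    // Build a series keyed by date from the (date, value) pairs.
    let newSeries = Series.ofObservations datesAndValues
    // Same overload shape as the one quoted from Frame.fs:
    // Join(colKey, series, kind, lookup).
    aFrame.Join(colName, newSeries, JoinKind.Inner, Lookup.Exact)
```

Note that an inner join keeps only the row keys present on both sides, so folding from a frame with no rows would presumably leave the result empty.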

Another way to add a series to a frame is to use a mutating operation:

aFrame.AddSeries(colName, newSeries)

This only supports a left join, though: a data frame can be mutated by adding new series, but not in a way that would change its index. You might still be able to use this approach and then drop all missing values from the frame once it is constructed.

EDIT: To answer the question about generic types:

  • Series<K, V> represents series with keys of type K containing values of type V (e.g. series with ordinarily indexed observations would have K=int and V=float)

  • Frame<R, C> represents a frame that contains heterogeneous data (of potentially varying types for each column) where the rows are indexed by R and the columns are indexed by C. For an ordinarily indexed frame R=int, and your columns will typically be named, so C=string (but you can have other indices too)
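To make the two type parameters concrete, here is a rough sketch that builds a frame straight from a map like the one in the question. The helper name mapToFrame and the sample data are made up for illustration, and the code assumes that Deedle's frame function and Series.ofObservations accept sequences of pairs, as they do in the versions I am aware of. The point is that the string in Frame<DateTime, string> is the type of the column keys, not of the values, which would also explain the Frame<'a, string> signature from EDIT 2.

```fsharp
open System
open Deedle

// Hypothetical helper (not from the original post): one column per map entry.
let mapToFrame (m: Map<string, seq<DateTime * float>>) : Frame<DateTime, string> =
    m
    |> Map.toSeq
    // Turn each (date, value) sequence into a Series<DateTime, float>.
    |> Seq.map (fun (name, obs) -> name, Series.ofObservations obs)
    // 'frame' builds a data frame from (column key, series) pairs.
    |> frame

// Made-up sample data, for illustration only.
let sample =
    Map.ofList
        [ "AAA", seq [ (DateTime(2013, 1, 1), 1.0); (DateTime(2013, 1, 2), 2.0) ]
          "BBB", seq [ (DateTime(2013, 1, 1), 10.0); (DateTime(2013, 1, 2), 20.0) ] ]

let df = mapToFrame sample

// R = DateTime (row keys), C = string (column keys); the values are still floats.
let aaa : Series<DateTime, float> = df.GetColumn<float>("AAA")
```

Building the frame in one step like this avoids the need for an empty starting frame in the fold.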



Source: https://stackoverflow.com/questions/19795949/map-to-deedle-frame
