How to get the position of the row with some key from a Deedle Frame<DateTime,_>?

Submitted by 大城市里の小女人 on 2021-01-29 02:49:11

Question


By position I mean:

let position : int = positionForKey frame key
let row =
  frame
  |> Frame.take position   // keep the rows up to and including the key
  |> Frame.takeLast 1      // then keep only the last of those rows

Then, row should be a Frame with only one row, whose key is key.

What I don't know is how to implement positionForKey. One idea that should work, though I don't know whether it's the best way, is to create another Series via Series.scanValues whose values are the positions. I think there ought to be a more elegant way of doing it.

The implementation via Series.scanValues would be:

let positionForKey (frame:Frame<'K,_>) (key:'K) =
  let positions = Series.scanValues (fun pos _ -> pos + 1) 0 (frame.GetColumnAt 0)
  positions.[key]

... index beginning from 1

Example

Say you have a Frame f like this:

03/01/01,  4 , ...
04/01/01,  3 , ...
05/01/01,  6 , ...
   ...  , ..., ...

then positionForKey f 04/01/01 = 2, positionForKey f 05/01/01 = 3, and so on (supposing that 04/01/01 were a valid DateTime).


Answer 1:


Deedle actually has built-in functions for doing this, but they are not very well documented (mostly because this area changed quite a bit while we were adding support for "virtual frames").

To illustrate, consider a sample data frame:

open System
open Deedle

let ts = series [ for i in 0 .. 365 -> DateTime(2017, 1, 1).AddDays(float i) => float i ]
let df = frame [ "Sample" => ts ]

The data frame has a row index which represents how the lookup using indices is performed. Using the RowIndex, you can locate the key and then translate the returned address to an index:

let addr = df.RowIndex.Locate(DateTime(2017, 5, 1))
let idx = df.RowIndex.AddressOperations.OffsetOf(addr)

And then you can get a frame with just this row:

df.GetRowsAt([| int idx |])

The address addr is just the index when you are working with in-memory data frames, but in virtual data frames it would be a number that encodes where the row is stored and so it would not directly map to an offset. That's why I added the OffsetOf call, which maps the address to an actual index. Though in case of in-memory frames, you do not need to worry about this.

If the key is not found, the addr value will be -1L (though in principle, you should use Addressing.Address.invalid when checking for this).




Answer 2:


You can extract the position of the key in several ways, for example using .RowIndex. But the simplest way is probably to just get the keys and find the index. In the line below, df is a data frame indexed by DateTime. Seq.findIndex throws if the key is not present, so you might want to use Seq.tryFindIndex instead, which returns an option.

df.RowKeys |> Seq.findIndex(fun x -> x = DateTime(2017,5,6))
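
As a variation on the lookup above that does not throw for a missing key, here is a minimal sketch using Seq.tryFindIndex. To keep it runnable without Deedle, a plain seq<DateTime> stands in for df.RowKeys, and the helper name tryPositionOfKey is made up for illustration; it is not part of Deedle.

```fsharp
open System

// Stand-in for df.RowKeys (an assumption for illustration): one key per day,
// starting on 2017-01-01, mirroring the sample series from the first answer.
let rowKeys = seq { for i in 0 .. 365 -> DateTime(2017, 1, 1).AddDays(float i) }

// Hypothetical helper: zero-based position of the key, or None if it is absent.
let tryPositionOfKey (key: DateTime) (keys: seq<DateTime>) : int option =
    keys |> Seq.tryFindIndex (fun k -> k = key)

let found = tryPositionOfKey (DateTime(2017, 5, 6)) rowKeys    // Some 125
let missing = tryPositionOfKey (DateTime(2019, 5, 6)) rowKeys  // None

// The question counts positions from 1, so shift the zero-based result by one.
let oneBased = found |> Option.map (fun i -> i + 1)            // Some 126
```

Note that this is a linear scan over the keys, so for large frames the RowIndex lookup from the first answer is likely the better fit.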

If you just want to get the row for a given key, you do not need the position at all; there is a function and an indexer for that. Here are some ways to get at the row by its key:

(Frame.getRow (DateTime(2017,5,6)) df):Series<string,string>

or

df.Rows.[(DateTime(2017,5,6))]

If you want to do something fancier, you should certainly consult the Deedle and Frame documentation.



Source: https://stackoverflow.com/questions/41883194/how-to-get-the-position-of-the-row-with-some-key-from-a-deedle-framedatetime
