Julia dataframe where a column is an array of arrays?

不打扰是莪最后的温柔 posted on 2020-01-04 06:11:32

Question


I'm trying to create a table where each row has time-series data associated with a particular test-case.

julia> df = DataFrame(var1 = Int64[], var2 = Int64[], ts = Array{Array{Int64, 1}, 1})
0x3 DataFrames.DataFrame

I'm able to create the data frame. Each var1, var2 pair is intended to have an associated time series.

I want to generate data in a loop and append each row to this data frame using push!.

I've tried

julia> push!(df, [1, 2, [3,4,5]])
ERROR: ArgumentError: Length of iterable does not match DataFrame column count.
  in push! at /Users/stro/.julia/v0.4/DataFrames/src/dataframe/dataframe.jl:871

and

julia> push!(df, (1, 2, [3,4,5]))
ERROR: ArgumentError: Error adding [3,4,5] to column :ts. Possible type mis-match.
 in push! at /Users/stro/.julia/v0.4/DataFrames/src/dataframe/dataframe.jl:883

What's the best way to go about this? Is my intended approach even the right path?


Answer 1:


You've accidentally passed the type of a vector for the ts column instead of an actual vector. This declaration will work:

df = DataFrame(var1 = Int64[], var2 = Int64[], ts = Array{Int64, 1}[])

Note the change from Array{Array{Int64, 1}, 1}, which is a type, to Array{Int64, 1}[], which is an actual vector with that type.
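The difference between the type and a value of that type can be checked directly. A minimal sketch in plain Julia (no DataFrames needed); the names T and v are only for illustration and do not come from the original post:

```julia
# Illustrative names, not from the original post.
T = Array{Array{Int64, 1}, 1}   # a type: describes "vector of Int64 vectors"
v = Array{Int64, 1}[]           # a value: an empty vector whose elements are Int64 vectors

println(isa(T, Type))           # true: T is a type, not a container
println(isa(v, Array))          # true: v is an actual array
println(length(v))              # 0

push!(v, [3, 4, 5])             # a value can hold data; a type cannot
println(length(v))              # 1
```

A data frame column needs something like v, which is why passing T fails once rows are pushed.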

Then things work:

julia> push!(df, (1, 2, [3,4,5]))

julia> df
1x3 DataFrames.DataFrame
│ Row │ var1 │ var2 │ ts      │
┝━━━━━┿━━━━━━┿━━━━━━┿━━━━━━━━━┥
│ 1   │ 1    │ 2    │ [3,4,5] │
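Since the question is about filling the table in a loop, here is a rough sketch of how that might look with the corrected declaration. The loop bounds and the generated series are placeholders invented for illustration; real data would replace them:

```julia
using DataFrames

# ts is an empty vector whose elements are themselves Int64 vectors
df = DataFrame(var1 = Int64[], var2 = Int64[], ts = Array{Int64, 1}[])

for i in 1:3
    series = collect(i:i+2)        # placeholder time series, e.g. [1, 2, 3] for i = 1
    push!(df, (i, 2i, series))     # a tuple keeps the vector as one cell
end

println(size(df, 1))               # number of rows pushed
```

Each row is pushed as a tuple so that the time series stays a single value in the ts column.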

Note that your other example, using [1, 2, [3,4,5]], still does not work. This is because of a quirk in Julia's array syntax: inside square brackets the comma does concatenation, so [1, 2, [3,4,5]] in fact means [1, 2, 3, 4, 5]. That gives push! five values for three columns, hence the column-count error. This behaviour is weird and will be fixed in Julia 0.5, but is preserved in 0.4 for backwards compatibility.
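The effect can be reproduced in any Julia version with vcat, which always concatenates. This is only a sketch of what the 0.4 bracket syntax amounted to, not of how push! is implemented; the names flat and row are illustrative:

```julia
flat = vcat(1, 2, [3, 4, 5])    # what [1, 2, [3,4,5]] meant in Julia 0.4
row  = (1, 2, [3, 4, 5])        # a tuple never concatenates its elements

println(length(flat))           # 5: does not match a 3-column data frame
println(length(row))            # 3: one entry per column
```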



Source: https://stackoverflow.com/questions/36729891/julia-dataframe-where-a-column-is-an-array-of-arrays
