One nice feature of DataFrames is that it can store columns of different types and "auto-recognise" them, e.g.:
using DataFrames, DataStructures
df1
mat2df(mat) =
    DataFrame([[mat[2:end,i]...] for i in 1:size(mat,2)], Symbol.(mat[1,:]))
This seems to work, and it is faster than @dan-getz's answer, at least for this data matrix (a median of about 21 μs versus 81 μs in the benchmark below) :)
using DataFrames, BenchmarkTools
dataMatrix = [
"parName" "region" "forType" "value";
"vol" "AL" "broadL_highF" 3.3055628012;
"vol" "AL" "con_highF" 2.1360975151;
"vol" "AQ" "broadL_highF" 5.81984502;
"vol" "AQ" "con_highF" 8.1462998309;
]
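# Build one column per matrix column; the splat narrows each column's element type.
mat2df(mat) =
    DataFrame([[mat[2:end,i]...] for i in 1:size(mat,2)], Symbol.(mat[1,:]))
# @dan-getz's approach: serialise to tab-separated text, then parse it back.
# Uses the `mat` argument rather than the global `dataMatrix`.
# (`indices` was renamed to `axes` in Julia 0.7.)
function mat2dfDan(mat)
    s = join([join([mat[i,j] for j in indices(mat, 2)], '\t')
              for i in indices(mat, 1)], '\n')
    DataFrames.inlinetable(s; separator='\t', header=true)
end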
julia> @benchmark mat2df(dataMatrix)
BenchmarkTools.Trial:
memory estimate: 5.05 KiB
allocs estimate: 75
--------------
minimum time: 18.601 μs (0.00% GC)
median time: 21.318 μs (0.00% GC)
mean time: 31.773 μs (2.50% GC)
maximum time: 4.287 ms (95.32% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark mat2dfDan(dataMatrix)
BenchmarkTools.Trial:
memory estimate: 17.55 KiB
allocs estimate: 318
--------------
minimum time: 69.183 μs (0.00% GC)
median time: 81.326 μs (0.00% GC)
mean time: 90.284 μs (2.97% GC)
maximum time: 5.565 ms (93.72% GC)
--------------
samples: 10000
evals/sample: 1
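To see why the splat in mat2df matters, here is a minimal sketch, assuming a recent Julia (1.x) and DataFrames.jl (1.x). A column sliced out of a mixed matrix is a Vector{Any}, and re-collecting it lets Julia choose a concrete element type. The name mat2df_sketch and the use of identity.(...) instead of the splat are my own illustration, not part of the original function:

```julia
using DataFrames

# Shortened copy of the sample data above: the first row holds the column names.
mat = ["parName" "region" "forType"      "value";
       "vol"     "AL"     "broadL_highF" 3.3055628012;
       "vol"     "AL"     "con_highF"    2.1360975151]

# A slice of a Matrix{Any} is still a Vector{Any} ...
raw = mat[2:end, 4]
# ... but splatting it into a new array literal gives a concretely typed vector.
narrowed = [raw...]

# Hypothetical variant of mat2df: broadcasting `identity` narrows the element
# type in the same way, without splatting every element as a separate argument.
mat2df_sketch(m) =
    DataFrame([identity.(m[2:end, i]) for i in 1:size(m, 2)], Symbol.(m[1, :]))

df = mat2df_sketch(mat)
```

Avoiding the splat may matter for matrices with many rows, since splatting passes every element as a separate argument; I have not benchmarked this variant, so treat it as an option to try rather than a measured improvement.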