How to convert a mixed-type Matrix to DataFrame in Julia recognising the column types

深忆病人 2021-01-26 02:50

One nice feature of DataFrames is that it can store columns of different types and "auto-recognise" them, e.g.:

using DataFrames, DataStructures

df1          


        
4 Answers
  •  死守一世寂寞
    2021-01-26 03:08

    mat2df(mat) = 
        DataFrame([[mat[2:end,i]...] for i in 1:size(mat,2)], Symbol.(mat[1,:]))
    

    This seems to work, and it is faster than @dan-getz's answer, at least for this data matrix (a median of about 21 μs versus 81 μs in the benchmarks below) :)
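The one-liner relies on splatting each column into a new array literal, which lets Julia re-infer the element type from the values instead of keeping the matrix's `Any`. Here is a minimal sketch of that effect using only Base Julia; the small `mat` is a made-up example, not the data from the question.

```julia
# A small mixed-type matrix; a literal mixing strings and floats is stored as Matrix{Any}.
mat = ["name" "value";
       "a"    1.5;
       "b"    2.5]

col_any    = mat[2:end, 2]        # plain slicing keeps the container eltype: Any
col_narrow = [mat[2:end, 2]...]   # splatting into a literal re-infers the eltype

println(eltype(mat))                  # element type of the whole matrix
println(eltype(col_any))              # element type of the sliced column
println(eltype(col_narrow))           # element type after splatting
println(eltype([mat[2:end, 1]...]))   # same trick on the string column
```

Because each column is rebuilt separately, every column gets its own element type, which is what `DataFrame` then picks up.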

    using DataFrames, BenchmarkTools
    
    dataMatrix = [
        "parName"   "region"    "forType"       "value";
        "vol"       "AL"        "broadL_highF"  3.3055628012;
        "vol"       "AL"        "con_highF"     2.1360975151;
        "vol"       "AQ"        "broadL_highF"  5.81984502;
        "vol"       "AQ"        "con_highF"     8.1462998309;
    ]
    
    mat2df(mat) = 
        DataFrame([[mat[2:end,i]...] for i in 1:size(mat,2)], Symbol.(mat[1,:]))
    
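    # @dan-getz's approach: serialise the matrix to tab-separated text, then parse it back.
    # `indices` and `DataFrames.inlinetable` belong to the Julia/DataFrames versions of the
    # time; newer Julia uses `axes`, and `inlinetable` may no longer be available.
    function mat2dfDan(mat)
        s = join([join([mat[i,j] for j in indices(mat, 2)], '\t') 
                    for i in indices(mat, 1)],'\n')
    
        DataFrames.inlinetable(s; separator='\t', header=true)
    end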
    


    julia> @benchmark mat2df(dataMatrix)
    
    BenchmarkTools.Trial: 
      memory estimate:  5.05 KiB
      allocs estimate:  75
      --------------
      minimum time:     18.601 μs (0.00% GC)
      median time:      21.318 μs (0.00% GC)
      mean time:        31.773 μs (2.50% GC)
      maximum time:     4.287 ms (95.32% GC)
      --------------
      samples:          10000
      evals/sample:     1
    
    julia> @benchmark mat2dfDan(dataMatrix)
    
    BenchmarkTools.Trial: 
      memory estimate:  17.55 KiB
      allocs estimate:  318
      --------------
      minimum time:     69.183 μs (0.00% GC)
      median time:      81.326 μs (0.00% GC)
      mean time:        90.284 μs (2.97% GC)
      maximum time:     5.565 ms (93.72% GC)
      --------------
      samples:          10000
      evals/sample:     1
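
The timings above are for a 4-row matrix. For much longer columns, splatting every value as a function argument may become the bottleneck. As a hedged alternative, the sketch below narrows each column by broadcasting `identity` instead of splatting. The helper name `narrow_columns` is hypothetical (it does not come from the answer), it has not been benchmarked here, and it assumes Julia 1.1 or later for `eachcol`.

```julia
# Hypothetical helper (not from the answer): split a header-plus-data matrix
# into column names and per-column vectors with narrowed element types.
function narrow_columns(mat::AbstractMatrix)
    colnames = Symbol.(mat[1, :])     # first row holds the column names
    body = @view mat[2:end, :]        # remaining rows hold the data
    # identity.(col) builds a new vector and lets Julia choose its element
    # type from the values, without splatting them as arguments.
    cols = [identity.(col) for col in eachcol(body)]
    return colnames, cols
end

dataMatrix = [
    "parName"  "region"  "forType"       "value";
    "vol"      "AL"      "broadL_highF"  3.3055628012;
    "vol"      "AL"      "con_highF"     2.1360975151;
]

colnames, cols = narrow_columns(dataMatrix)
println(colnames)
println(eltype.(cols))
```

With DataFrames loaded, `DataFrame(cols, colnames)` uses the same constructor form as `mat2df`. One caveat: for a column mixing integers and floats, broadcasting may choose a different element type than the splatting version, so check the result if your columns are not uniform.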
    
