How to Consolidate Data from Multiple Excel Columns All into One Column

慢半拍i 2020-12-09 10:40

Let's say I have an Excel sheet with 4 columns of data and 20,000 rows in each column.

What is the most efficient way to get it so that I have all of that d

6 answers
  •  情书的邮戳
    2020-12-09 11:06

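    You didn't mention whether you are using Excel 2003 or 2007, but you may run into the row limit in Excel 2003, which is capped at 65,536 rows per sheet. Stacking 4 columns of 20,000 rows gives 80,000 rows, which exceeds that cap. In Excel 2007 the limit is 1,048,576 rows, so a single column would fit.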

    Also, can I ask what your end goal is for your analysis? If you need to perform many statistical calculations on your data, I would recommend moving out of the Excel environment into something that is more directly suited for data manipulation and analysis, such as R.

    There are a variety of options for connecting R to Excel, including

    1. RExcel
    2. RODBC
    3. Other options in the R manual

    Regardless of what you choose to use to move data in/out of R, the code to change from wide to long format is pretty trivial. I enjoy the melt() function from the reshape package. That code would look like:

    library(reshape)

    # Fake data: 4 columns, 20,000 rows each
    df <- data.frame(
        foo = rnorm(20000),
        bar = rlnorm(20000),
        fee = rnorm(20000),
        fie = rlnorm(20000)
    )

    # With no id variables, melt() treats every column as measured and
    # returns 80,000 rows: `variable` (source column name) and `value`
    df.m <- melt(df)
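
    If you would rather not install a package, the same wide-to-long step can be sketched in base R. This is only an illustration on a tiny made-up data frame (the values and the names long and one_col are placeholders, not part of the original question): stack() keeps track of which column each value came from, while unlist() simply concatenates the columns into one vector.

    ```r
    # Hypothetical small data frame standing in for the 4-column sheet
    df <- data.frame(
        foo = c(1, 2, 3),
        bar = c(4, 5, 6),
        fee = c(7, 8, 9),
        fie = c(10, 11, 12)
    )

    # stack() ships with base R: it returns a `values` column plus an
    # `ind` factor recording the source column of each value
    long <- stack(df)

    # unlist() concatenates the columns in order (foo, bar, fee, fie)
    # into a single vector, wrapped here in a one-column data frame
    one_col <- data.frame(value = unlist(df, use.names = FALSE))
    ```

    Use stack() if you still need to know which column a value came from, and unlist() if you really want nothing but one column.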
    

    From there, you can perform any number of statistical or graphing operations. If you use the RExcel plugin above, you can fire all of this up and run it within Excel itself. The R community is very active and can help address any and all questions you may encounter.

    Good luck!
