group-by

Reshaping data.frame with a by-group where id variable repeats [duplicate]

蹲街弑〆低调 submitted on 2020-06-09 05:37:25
Question: This question already has answers here: How to reshape data from long to wide format (11 answers). Closed 21 days ago. I want to reshape/rearrange a dataset that is stored as a data.frame with 2 columns:
id (non-unique, i.e. can repeat over several rows) --> stored as character
value --> stored as numeric value (range 1:3)
Sample data:
    id <- as.character(1001:1003)
    val_list <- data.frame(sample(1:3, size=12, replace=TRUE))
    have <- data.frame(cbind(rep(id, 4), val_list))
    colnames(have) <- c

MySQL: select, group by and transform rows to separate columns :)

本小妞迷上赌 submitted on 2020-06-01 06:40:31
Question: I need to ask for help with a MySQL select query. Specific example: employees with their spouses and kids. I have 2 tables already joined into one, and now I need to:
1. select the data, grouping it by the 'emp' field
2. transform the result with these rules:
   only one row per particular emp (emp-A, emp-B, emp-C)
   each relative (spouse and kids) in succeeding columns (spouse first, kids next)
The table (in fact two joined tables): +---------+-----------+-----------+------------+ | emp |

MySQL query - join 4 tables together, with 3 tables using group by one column from each

我的梦境 submitted on 2020-05-31 04:02:31
Question: Here are examples of the 4 tables I'm working with.
Items
+----+------+
| id | name |
+----+------+
| 1  | abc  |
| 2  | def  |
| 3  | ghi  |
+----+------+
Buy Table
+----+------------+-----+---------+
| id | date       | qty | item_id |
+----+------------+-----+---------+
| 1  | 2020-05-01 | 10  | 1       |
| 2  | 2020-05-02 | 20  | 2       |
| 3  | 2020-05-03 | 5   | 3       |
+----+------------+-----+---------+
Rent Table
+----+------------+-----+---------+
| id | date       | qty | item_id |
+----+------------+-----+---------

Get weighted average summary data column in new pandas dataframe from existing dataframe based on other column-ID

只谈情不闲聊 submitted on 2020-05-30 08:00:06
Question: Somewhat similar to an earlier question I had here: Get summary data columns in new pandas dataframe from existing dataframe based on other column-ID. However, instead of just taking the sum of the data points, I want the weighted average in an extra column. I'll repeat and rephrase the question: I want to summarize the data in a dataframe and add the new columns to another dataframe. My data contains apartments with an ID number, and it has surfaces and U-values for each room in
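A minimal pandas sketch of a per-ID weighted average, for readers who want to see the shape of the computation. The column names (`ID`, `surface`, `U_value`) and the numbers are assumptions made up for illustration; the excerpt above is cut off before the real data appears. The U-value is weighted by room surface.

```python
import pandas as pd

# Made-up rooms: two apartments, each with a surface and a U-value per room.
rooms = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2],
    "surface": [10.0, 30.0, 20.0, 20.0, 10.0],
    "U_value": [1.0, 2.0, 0.5, 1.5, 1.0],
})

# Sum of surface * U-value per apartment (numerator of the weighted mean).
weighted_sum = (rooms["surface"] * rooms["U_value"]).groupby(rooms["ID"]).sum()

# Total surface per apartment (denominator of the weighted mean).
total_surface = rooms.groupby("ID")["surface"].sum()

# One row per apartment, holding the total and the surface-weighted average.
summary = pd.DataFrame({
    "total_surface": total_surface,
    "U_weighted": weighted_sum / total_surface,
})
print(summary)
```

Because `summary` is indexed by `ID`, it can be joined onto another dataframe that carries the same ID column or index.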

How to groupby and pivot a dataframe with non-numeric values

无人久伴 submitted on 2020-05-29 11:38:10
Question: I'm using Python, and I have a dataset of 6 columns: R, Rc, J, T, Ca and Cb. I need to "aggregate" on the columns "R" then "J", so that for each R, each row is a unique "J". Rc is a characteristic of R. Ca and Cb are characteristics of T. It will make more sense looking at the table below. I need to go from:
#______________________    ________________________________________________________________
#| R Rc J T Ca Cb|         |# R Rc J Ca(T=1) Ca(T=2) Ca(T=3) Cb(T=1) Cb(T=2) Cb(T=3)|
#| a p 1 1 x d|            |# a
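One possible way to get that wide layout in pandas is `pivot_table` with `aggfunc="first"`, which does not need numeric values. This is only a sketch: apart from the first row (`a p 1 1 x d`), the sample rows are invented, because the table in the excerpt is truncated.

```python
import pandas as pd

# First row taken from the excerpt; the remaining rows are made up.
df = pd.DataFrame({
    "R":  ["a", "a", "a", "b", "b"],
    "Rc": ["p", "p", "p", "q", "q"],
    "J":  [1, 1, 1, 1, 1],
    "T":  [1, 2, 3, 1, 2],
    "Ca": ["x", "y", "z", "x", "y"],
    "Cb": ["d", "e", "f", "d", "e"],
})

# Spread T across columns; "first" simply picks the single value in each cell.
wide = df.pivot_table(
    index=["R", "Rc", "J"],
    columns="T",
    values=["Ca", "Cb"],
    aggfunc="first",
)

# Flatten the (name, T) column pairs into labels such as "Ca(T=1)".
wide.columns = [f"{name}(T={t})" for name, t in wide.columns]
wide = wide.reset_index()
print(wide)
```

Combinations that do not occur in the input (here `b` with `T=3`) come out as missing values.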

Rowwise operation with adaptive range using dplyr

老子叫甜甜 submitted on 2020-05-28 04:53:36
Question: Based on my earlier question, I would like to calculate colocation instances (i.e. two people appearing at the same time) from smartcard data. Here is a made-up sample consisting of ten records:
    library(lubridate)
    smartcard <- c(1,2,3,2,1,2,4,4,1,1)
    boarding_stop <- c("C23", "C14", "C23", "C23", "C23", "C14", "C14", "C23", "C14", "C23")
    boarding_time <- as.times(c("07:24:01", "07:26:18", "07:37:19", "08:29:22", "08:34:10", "15:55:23", "16:20:22", "17:07:31", "17:13:34", "17:35:52"))

Finding max occurrence of a column's value, after group-by on another column

老子叫甜甜 submitted on 2020-05-27 06:45:07
Question: I have a pandas data-frame:
    id                    city
    000.tushar@gmail.com  Bangalore
    00078r@gmail.com      Mumbai
    0007ayan@gmail.com    Jamshedpur
    0007ayan@gmail.com    Jamshedpur
    000.tushar@gmail.com  Bangalore
    00078r@gmail.com      Mumbai
    00078r@gmail.com      Vijayawada
    00078r@gmail.com      Vijayawada
    00078r@gmail.com      Vijayawada
I want to find, for each id, the most frequently occurring city name, so that for a given id I can say: this is his favorite city:
    id                    city
    000.tushar@gmail.com  Bangalore
    00078r@gmail.com      Vijayawada
    0007ayan@gmail
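A short sketch of one way to do this, using the rows shown in the question: group by `id`, count the cities inside each group with `value_counts`, and take the label with the highest count. The variable name `favorite` is chosen here for illustration, and how ties are broken is not specified by the question.

```python
import pandas as pd

# The nine rows from the question.
df = pd.DataFrame({
    "id": [
        "000.tushar@gmail.com", "00078r@gmail.com", "0007ayan@gmail.com",
        "0007ayan@gmail.com", "000.tushar@gmail.com", "00078r@gmail.com",
        "00078r@gmail.com", "00078r@gmail.com", "00078r@gmail.com",
    ],
    "city": [
        "Bangalore", "Mumbai", "Jamshedpur",
        "Jamshedpur", "Bangalore", "Mumbai",
        "Vijayawada", "Vijayawada", "Vijayawada",
    ],
})

# For each id, count its cities and keep the one that occurs most often.
favorite = df.groupby("id")["city"].agg(lambda s: s.value_counts().idxmax())
print(favorite)
```

`favorite` is a Series indexed by `id`; calling `favorite.reset_index()` turns it back into a two-column dataframe like the one requested.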

Count groups of consecutive 1s in pandas

寵の児 submitted on 2020-05-26 09:51:08
Question: I have a list of '1's and '0's and I would like to calculate the number of groups of consecutive '1's.
    mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
Doing it by hand gives us 3 groups, but is there a way to do it in Python?
Answer 1: Option 1: with pandas. First, initialise a dataframe:
    In [78]: df
    Out[78]:
        Col1
    0      0
    1      0
    2      1
    3      1
    4      0
    5      1
    6      1
    7      1
    8      1
    9      0
    10     1
    11     0
Now calculate sum total by number of groups:
    In [79]: df.sum() / df.diff().eq(1).cumsum().max()
    Out[79]:
    Col1    2.333333
    dtype: float64
If you want just
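For comparison, here is a standard-library sketch that counts the groups directly. It is not the answerer's pandas approach: `itertools.groupby` collapses each run of equal values into one item, so counting the runs whose key is 1 gives the number of groups.

```python
from itertools import groupby

mylist = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]

# groupby yields one (key, run) pair per stretch of equal consecutive values.
n_groups = sum(1 for key, _run in groupby(mylist) if key == 1)
print(n_groups)  # 3

# Length of each run of 1s, in order of appearance.
run_lengths = [len(list(run)) for key, run in groupby(mylist) if key == 1]
print(run_lengths)  # [2, 4, 1]
```

The quoted pandas result, 2.333333, is the total number of 1s (7) divided by this group count (3).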