aggregate

aggregate a matrix (or data.frame) by column name groups in R

筅森魡賤 submitted on 2019-12-10 09:43:08
Question: I have a large matrix with about 3000 columns x 3000 rows. I'd like to aggregate (calculate the mean) grouped by column names for every row. Each column is named similar to this (and in random order):
    Tree Tree House House Tree Car Car House
I would need the result (the mean of every row within each group) to have the following columns:
    Tree House Car
The tricky part (at least for me) is that I do not know all the column names, and they are all in random order! Answer 1: You could try res1 <-
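
The answer excerpt above is cut off, so the following is only a minimal base R sketch of one way to get per-row means grouped by column name. It is not the original answer's code. The matrix `m` and its values are hypothetical stand-ins for the asker's 3000 x 3000 matrix.

```r
# Hypothetical example matrix; column names repeat in arbitrary order.
m <- matrix(1:24, nrow = 3,
            dimnames = list(NULL, c("Tree", "Tree", "House", "House",
                                    "Tree", "Car", "Car", "House")))

# Group the column indices by column name (names need not be known in advance).
groups <- split(seq_len(ncol(m)), colnames(m))

# For each group, average its columns row by row.
# drop = FALSE keeps a one-column group as a matrix so rowMeans() still works.
row_means_by_name <- sapply(groups, function(idx) rowMeans(m[, idx, drop = FALSE]))
```

The result has one row per original row and one column per distinct name. Because `split()` orders groups by factor level, the columns come out alphabetically (Car, House, Tree) rather than in order of first appearance.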

R: Sum Complete.cases in one column grouped by (or sorted by) a value in another column

萝らか妹 submitted on 2019-12-10 09:31:51
Question: I'm using the airquality data set available in R, and attempting to count the number of rows within the data that do not contain any NAs, while aggregating by Month. The data looks like this:
    head(airquality)
    #   Ozone Solar.R Wind Temp Month Day
    # 1    41     190  7.4   67     5   1
    # 2    36     118  8.0   72     5   2
    # 3    12     149 12.6   74     5   3
    # 4    18     313 11.5   62     5   4
    # 5    NA      NA 14.3   56     5   5
    # 6    28      NA 14.9   66     5   6
As you can see, I have NAs in columns Ozone and Solar.R. I used the function complete.cases as follows: x <-
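
The asker's own code is truncated above, so this is only a hedged sketch of one base R way to count NA-free rows per Month. The variable names `complete` and `complete_by_month` are illustrative and not taken from the question.

```r
# TRUE for each row of the built-in airquality data with no NA in any column.
complete <- complete.cases(airquality)

# Summing a logical vector counts its TRUE values; tapply() does this per Month.
complete_by_month <- tapply(complete, airquality$Month, sum)
```

`complete_by_month` is a named vector with one count per month (names "5" through "9"). Counting on the logical vector avoids subsetting the data frame first.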

How to calculate new column depending on aggregate function on group using dplyr (add summary statistics on the summary statistics)?

眉间皱痕 submitted on 2019-12-10 04:30:01
Question: Quite often I need to calculate a new column for an R dataframe (in long form), whose value should depend on an aggregate function (e.g. sum) of a group. For instance, I might want to know what fraction of sales a product accounts for on any given day:
    daily fraction = revenue for product i on day d / sum of revenue for all products on day d
My current strategy is to summarise and join:
    library(dplyr)
    join_summary <- function(data, ...) left_join(data, summarise(data, ...))
    data = data.frame(
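
As a possible alternative to summarise-and-join, here is a small sketch that uses base R `ave()` rather than dplyr, so it runs without extra packages. The `sales` data frame and its column names are hypothetical, since the question's own example data is cut off.

```r
# Hypothetical long-form sales data: one row per product per day.
sales <- data.frame(
  day     = c(1, 1, 1, 2, 2),
  product = c("a", "b", "c", "a", "b"),
  revenue = c(10, 30, 60, 5, 15)
)

# ave() returns the group-wise sum aligned with the original rows,
# so no separate summarise-and-join step is needed.
sales$daily_total    <- ave(sales$revenue, sales$day, FUN = sum)
sales$daily_fraction <- sales$revenue / sales$daily_total
```

In dplyr, the same idea is usually written as a grouped `mutate()`, for example `group_by(day)` followed by `mutate(daily_fraction = revenue / sum(revenue))`, which likewise keeps every original row.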

Do std::tuple and std::pair support aggregate initialization?

最后都变了- submitted on 2019-12-10 03:37:04
Question: Aggregate initialization requires, among other things, no user-provided constructors. But std::tuple and std::pair have a large set of overloaded constructors. From the point of view of the core language, are these constructors user-provided or even user-declared? With C++17 it will be possible to write (update/clarification: where nocopy is a class that cannot be copied or moved, such as std::mutex)
    auto get_ensured_rvo_str(){ return std::pair(std::string(),nocopy()); }
edit: no, it's not

Use something like TOP with GROUP BY

别来无恙 submitted on 2019-12-10 03:29:10
Question: With table table1 like below
    +--------+-------+-------+------------+-------+
    | flight | orig  | dest  | passenger  | bags  |
    +--------+-------+-------+------------+-------+
    | 1111   | sfo   | chi   | david      | 3     |
    | 1112   | sfo   | dal   | david      | 7     |
    | 1112   | sfo   | dal   | kim        | 10    |
    | 1113   | lax   | san   | ameera     | 5     |
    | 1114   | lax   | lfr   | tim        | 6     |
    | 1114   | lax   | lfr   | jake       | 8     |
    +--------+-------+-------+------------+-------+
I'm aggregating the table by orig like below: select orig , count(*) flight_cnt , count

Conditional calculating Maximum value in the column

人盡茶涼 submitted on 2019-12-09 23:45:44
Question: I have the following table:
    Class x2 x3 x4
    A     14 45 53
    A      8 18 17
    A     16 49 20
    B     78 21 48
    B      8 18  5
For each Class (A and B), I need to find the maximum value in column x3, keep that row, and delete the other rows. The output should look like:
    Class x2 x3 x4
    A     16 49 20
    B     78 21 48
Please ask me questions if something is unclear in my problem. Thank you! Answer 1: A base R approach could be:
    mydf[as.logical(with(mydf, ave(x3, Class, FUN = function(x) x == max(x)))), ]
    #   Class x2 x3 x4
    # 3     A 16 49 20
    # 4

how to aggregate this data in R

落花浮王杯 submitted on 2019-12-09 18:22:31
Question: I have a data frame in R with the following structure.
    > testData
                date exch.code comm.code     oi
    1     1997-12-30       CBT         1 468710
    2     1997-12-23       CBT         1 457165
    3     1997-12-19       CBT         1 461520
    4     1997-12-16       CBT         1 444190
    5     1997-12-09       CBT         1 446190
    6     1997-12-02       CBT         1 443085
    ....
    77827 2004-10-26      NYME       967  10038
    77828 2004-10-19      NYME       967   9910
    77829 2004-10-12      NYME       967  10195
    77830 2004-09-28      NYME       967   9970
    77831 2004-08-31      NYME       967   9155
    77832 2004-08-24      NYME       967   8655
What I want to do is produce a table that shows for a

Can an aggregate's invariant include a rule based on information from elsewhere?

徘徊边缘 submitted on 2019-12-09 16:04:32
Question: In DDD, can an aggregate's invariant include a rule based on information in another aggregate? I don't think so; however, this causes me a problem and I don't know how to solve it. I have an entity called Asset (equipment) which I'm modelling as the root of an aggregate. It has a list of Tags (properties) that describe things like Manufacturer, Model, etc. It stores the identity of a second aggregate called AssetType, which has a list of TagTypes, some of which can be marked as mandatory. Now

Aggregate data in one column based on values in another column

家住魔仙堡 submitted on 2019-12-09 15:39:36
Question: I know there is an easy way to do this, but I can't figure it out. I have a dataframe in my R script that looks something like this:
    A   B C
    1.2 4 8
    2.3 4 9
    2.3 6 0
    1.2 3 3
    3.4 2 1
    1.2 5 1
Note that A, B, and C are column names. I'm trying to get variables like this:
    sum1 <- [the sum of all B values such that A is 1.2]
    num1 <- [the number of times A is 1.2]
Is there an easy way to do this? I basically want to end up with a data frame that looks like this:
    A   num totalB
    1.2 3   12
    etc etc etc
Where
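
One way this could be sketched in base R is with two `aggregate()` calls, one for the count and one for the sum. The data frame below rebuilds the six rows shown in the question. The names `df`, `num`, `tot`, and `res` are illustrative choices, not the asker's.

```r
# The example data from the question.
df <- data.frame(A = c(1.2, 2.3, 2.3, 1.2, 3.4, 1.2),
                 B = c(4, 4, 6, 3, 2, 5),
                 C = c(8, 9, 0, 3, 1, 1))

# Count the rows and sum B within each distinct value of A.
num <- aggregate(B ~ A, data = df, FUN = length)
tot <- aggregate(B ~ A, data = df, FUN = sum)

# Both results are sorted by A, so their rows line up one to one.
res <- data.frame(A = tot$A, num = num$B, totalB = tot$B)
```

For A equal to 1.2 this gives num 3 and totalB 12, matching the row the asker expects.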

unexpected output from aggregate

﹥>﹥吖頭↗ submitted on 2019-12-09 15:27:21
Question: While experimenting with aggregate for another question here, I encountered a rather strange result. I'm unable to figure out why, and am wondering if what I'm doing is totally wrong. Suppose I have a data.frame like this:
    df <- structure(list(V1 = c(1L, 2L, 1L, 2L, 3L, 1L),
                         V2 = c(2L, 3L, 2L, 3L, 4L, 2L),
                         V3 = c(3L, 4L, 3L, 4L, 5L, 3L),
                         V4 = c(4L, 5L, 4L, 5L, 6L, 4L)),
                    .Names = c("V1", "V2", "V3", "V4"),
                    row.names = c(NA, -6L), class = "data.frame")
    > df
    #   V1 V2 V3 V4
    # 1  1  2  3  4
    # 2  2  3  4  5