aggregate

Combine data.frames summing up values of identical columns in R

谁说胖子不能爱 submitted on 2019-12-09 15:08:16
Question: I have three data frames (rows: sites, columns: species names) of species abundances within sites. The number of rows is identical, but the number of columns differs because not all species occur in all three data frames. I would like to merge them into one data frame in which the abundances of identical species are summed. For example:

    data.frame1
          Sp1 Sp2 Sp3 Sp4
    site1   1   2   3   1
    site2   0   2   0   1
    site3   1   1   1   1

    data.frame2
          Sp1 Sp2 Sp4
    site1   0   1   2
    site2   1   2   0
    site3   1   1   1

    data.frame3
          Sp1 Sp2 Sp5 Sp6
    site1   0   1   1   1
    site2   1   1   1   5
    site3   2

What differs between post-filter and global aggregation for faceted search?

回眸只為那壹抹淺笑 submitted on 2019-12-09 13:21:15
Question: A common problem in search interfaces is that you want to return a selection of results, but might also want to return information about all documents (e.g. I want to see all red shirts, but want to know what other colors are available). This is sometimes referred to as "faceted results" or "faceted navigation". The example from the Elasticsearch reference is quite clear in explaining why and how, so I've used it as the basis for this question. Summary / Question: It looks like I can use both a

Android: Get highest value in column

眉间皱痕 submitted on 2019-12-09 13:15:57
Question: I have a URL pointing to content and I need to get the highest value contained in one of the columns. Is there an aggregate function that will accomplish this, or do I have to do it manually? Answer 1: If you're querying an Android content provider, you should be able to achieve this by passing MAX(COLUMN_NAME) in the projection parameter of ContentResolver.query, provided the provider hands the projection through to SQLite (the selection parameter is a WHERE clause and cannot hold a bare aggregate): getContentResolver().query(uri, new String[] { "MAX(COLUMN_NAME)" }, null, null, null); where uri is the address of the content provider.

Elegant way to solve ddply task with aggregate (hoping for better performance)

风格不统一 submitted on 2019-12-09 11:15:48
Question: I would like to aggregate a data.frame by an identifier variable called ensg. The data frame looks like this:

      chromosome probeset               ensg symbol   XXA_00   XXA_36   XXB_00
    1          X  4938842 ENSMUSG00000000003   Pbsn 4.796123 4.737717 5.326664

I want to compute the mean of each numeric column over rows with the same ensg value. The difficulty is that I would like to leave the other identifier variables, chromosome and symbol, untouched, since they are also identical for rows with the same ensg. In the end I would like to have a

How to add to a list using Linq's aggregate function C#

冷暖自知 submitted on 2019-12-09 10:29:19
Question: I have a collection of objects of one type that I'd like to convert to a different type. This can be done easily with foreach, but I'd like to figure out how to use Linq's Aggregate function to do it. The problem is that all the Aggregate examples use types like string or int, which support the '+' operator. I'd like the accumulator type to be a list, which doesn't support '+' semantics. Here's a quick example: public class DestinationType { public DestinationType(int A, int B, int C) { ... }

How to aggregate elements of a list of tuples if the tuples have the same first element?

微笑、不失礼 submitted on 2019-12-09 08:59:12
Question: I have a list in which each value is a list of tuples. For example, this is the value I extract for one key: [('1998-01-20', 8), ('1998-01-22', 4), ('1998-06-18', 8), ('1999-07-15', 7), ('1999-07-21', 1)]. I have also sorted the list. Now I want to aggregate the values like this: [('1998-01', 12), ('1998-06', 8), ('1999-07', 8)]. In other words, I want to group my tuples by month and sum the ints for each month. I have read about groupby and I think it can't help
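The grouping described in this question can be sketched with itertools.groupby. This is a minimal illustration rather than the accepted answer: it assumes the dates are ISO-formatted strings (so the first seven characters identify the month) and that the list is already sorted, as the question states. The names records, month_key and monthly_totals are made up for the example.

```python
from itertools import groupby

# Sample data copied from the question; already sorted by date.
records = [('1998-01-20', 8), ('1998-01-22', 4), ('1998-06-18', 8),
           ('1999-07-15', 7), ('1999-07-21', 1)]


def month_key(record):
    """Return the 'YYYY-MM' prefix of the date string in a (date, value) tuple."""
    return record[0][:7]


# groupby only merges *adjacent* items with equal keys, so sorted input matters.
monthly_totals = [
    (month, sum(value for _, value in group))
    for month, group in groupby(records, key=month_key)
]

print(monthly_totals)
# [('1998-01', 12), ('1998-06', 8), ('1999-07', 8)]
```

If the input were not sorted, sorting it by the same key first (sorted(records, key=month_key)) would be needed before grouping.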

Special grouping number for each pairs

浪子不回头ぞ submitted on 2019-12-09 03:54:19
Question: Part of this question has already been answered in special-group-number-for-each-combination-of-data. In most cases we have pairs and other data values inside the data. What we want is to number those groups when the pairs exist, and to keep numbering them until the next pair. I concatenated each pair: pairs such as c("bad","good") should be grouped, and the group c('Veni',"vidi","Vici") should be assigned the unique number 666. Here is the example data names <- c(c("bad","good"),1,2,c("good",

R: spread function on data frame with duplicates

Deadly submitted on 2019-12-09 03:43:01
Question: I have a data frame that I need to pivot, but it has duplicate identifiers, so the spread function gives an error:

    Error: Duplicate identifiers for rows (5, 6)

    Dimension = c("A","A","B","B","A","A")
    Date = c("Mon","Tue","Mon","Wed","Fri","Fri")
    Metric = c(23,25,7,9,7,8)
    df = data.frame(Dimension,Date,Metric)
    df
      Dimension Date Metric
    1         A  Mon     23
    2         A  Tue     25
    3         B  Mon      7
    4         B  Wed      9
    5         A  Fri      7
    6         A  Fri      8
    library(tidyr)
    df1 = spread(df, Date, Metric, fill = " ")
    Error: Duplicate identifiers for

R: Aggregate character strings with c

三世轮回 submitted on 2019-12-09 03:23:36
Question: I have a data frame with two columns: one contains strings, the other integers.

    > rnames = sapply(1:20, FUN=function(x) paste("item", x, sep="."))
    > x <- sample(c(1:5), 20, replace = TRUE)
    > df <- data.frame(x, rnames)
    > df
       x  rnames
    1  5  item.1
    2  3  item.2
    3  5  item.3
    4  3  item.4
    5  1  item.5
    6  3  item.6
    7  4  item.7
    8  5  item.8
    9  4  item.9
    10 5 item.10
    11 5 item.11
    12 2 item.12
    13 2 item.13
    14 1 item.14
    15 3 item.15
    16 4 item.16
    17 5 item.17
    18 4 item.18
    19 1 item.19
    20 1 item.20

I'm trying to

Pandas GroupBy.agg() throws TypeError: aggregate() missing 1 required positional argument: 'arg'

早过忘川 submitted on 2019-12-09 03:16:46
Question: I'm trying to create multiple aggregations of the same field. I'm working in pandas, in Python 3.7. The syntax seems pretty straightforward based on the documentation: https://pandas-docs.github.io/pandas-docs-travis/user_guide/groupby.html#named-aggregation I do not see why I'm getting the error below. Could someone please point out the issue and tell me how to fix it?

Code:

    qt_dy.groupby('date').agg(std_qty=('qty','std'),mean_qty=('qty','mean'),)

Error:

    --------------------------------------
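One possible cause, offered as an assumption because the traceback is cut off: named aggregation (keyword arguments mapped to (column, function) tuples) was introduced in pandas 0.25, and on older versions agg still expects a positional argument, which would match the message in the title. The sketch below uses a small made-up frame in place of the question's qt_dy and shows the named-aggregation call alongside a form that should also work on older pandas versions; the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the question's qt_dy frame.
qt_dy = pd.DataFrame({
    "date": ["2019-01-01", "2019-01-01", "2019-01-02", "2019-01-02"],
    "qty": [1.0, 3.0, 2.0, 6.0],
})

# Named aggregation: needs pandas 0.25 or newer.
named = qt_dy.groupby("date").agg(
    std_qty=("qty", "std"),
    mean_qty=("qty", "mean"),
)

# Alternative that does not rely on named aggregation:
# aggregate with a list of function names, then rename the columns.
legacy = (
    qt_dy.groupby("date")["qty"]
    .agg(["std", "mean"])
    .rename(columns={"std": "std_qty", "mean": "mean_qty"})
)

print(named)
print(legacy)
```

Checking pd.__version__ is a quick way to tell which of the two forms applies in a given environment.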