group-by

How can I count the number of grouped pairs in which one row's column value is greater than another?

大兔子大兔子 submitted on 2020-03-04 20:01:12
Question: I have a dataset (df1) with a number of paired values. One row of the pair is for one year (e.g., 2014), the other for a different year (e.g., 2013). Each pair has a value in column G. I need a count of the number of pairs in which the G value for the later year is less than the G value for the earlier year. Here is my dput for the dataset df1: structure(list(Name = c("A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", "A.J. Pierzynski", "Aaron Boone", "Adam Kennedy", "Adam Melhuse", …
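A runnable sketch of the grouped comparison, written in pandas rather than R since the dput above is truncated; the Year column name and all values below are assumptions for illustration:

```python
import pandas as pd

# Hypothetical two-rows-per-player frame mirroring the question.
df = pd.DataFrame({
    "Name": ["A.J. Ellis", "A.J. Ellis", "A.J. Pierzynski", "A.J. Pierzynski"],
    "Year": [2013, 2014, 2013, 2014],
    "G":    [115, 93, 134, 102],
})

# Sort each pair chronologically, take G(later year) - G(earlier year),
# and count the pairs where the value dropped.
df = df.sort_values(["Name", "Year"])
delta = df.groupby("Name")["G"].agg(lambda s: s.iloc[-1] - s.iloc[0])
print((delta < 0).sum())  # 2 pairs dropped in this toy data
```

The same shape works in dplyr with group_by(Name) followed by a summarise over the year-sorted G values.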

Get aggregated average values by joining three tables and display them next to each value in the first table

橙三吉。 submitted on 2020-03-04 06:28:37
Question: I have three tables, which you can also find in the SQL fiddle: CREATE TABLE Sales ( Product_ID VARCHAR(255), Sales_Value VARCHAR(255), Sales_Quantity VARCHAR(255) ); INSERT INTO Sales (Product_ID, Sales_Value, Sales_Quantity) VALUES ("P001", "500", "200"), ("P002", "600", "100"), ("P003", "300", "250"), ("P004", "900", "400"), ("P005", "800", "600"), ("P006", "200", "150"), ("P007", "700", "550"); CREATE TABLE Products ( Product_ID VARCHAR(255), Product_Name VARCHAR(255), Category_ID VARCHAR…
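The excerpt cuts off before Products (and any Categories table) is fully defined, so here is a sketch under assumptions: Products carries a Category_ID, and the goal is each product's sales value next to its category average. Python's sqlite3 serves as a runnable stand-in (the window AVG needs SQLite 3.25 or newer), and the Products rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (Product_ID TEXT, Sales_Value REAL, Sales_Quantity REAL);
INSERT INTO Sales VALUES
  ('P001', 500, 200), ('P002', 600, 100), ('P003', 300, 250),
  ('P004', 900, 400), ('P005', 800, 600), ('P006', 200, 150),
  ('P007', 700, 550);
-- Products rows are invented: the excerpt cuts off mid-definition.
CREATE TABLE Products (Product_ID TEXT, Product_Name TEXT, Category_ID TEXT);
INSERT INTO Products VALUES
  ('P001', 'Apple',  'C1'), ('P002', 'Banana', 'C1'),
  ('P003', 'Chair',  'C2'), ('P004', 'Table',  'C2'),
  ('P005', 'Desk',   'C2'), ('P006', 'Pen',    'C3'),
  ('P007', 'Ink',    'C3');
""")

# A window AVG partitioned by category puts the category average on
# every product row without a second GROUP BY query.
rows = conn.execute("""
    SELECT s.Product_ID,
           s.Sales_Value,
           AVG(s.Sales_Value) OVER (PARTITION BY p.Category_ID) AS category_avg
    FROM Sales s
    JOIN Products p ON p.Product_ID = s.Product_ID
    ORDER BY s.Product_ID
""").fetchall()
for row in rows:
    print(row)  # e.g. ('P001', 500.0, 550.0): C1 averages 500 and 600
```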

Filtering an aggregated chart with another aggregation field

泪湿孤枕 submitted on 2020-03-03 09:45:57
Question: I'm trying to produce something similar to the K-top example, except that instead of filtering and displaying the same aggregated field, I want to display one type of aggregated data (the max of daily temps) and filter on another aggregated field (the mean of daily temps). I've created an observable notebook here to build my test case, and this is how far I got: { "$schema": "https://vega.github.io/schema/vega-lite/v4.json", "data": {"url": "data/seattle-weather.csv"}, "transform": …
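One way to get this shape in Vega-Lite is a joinaggregate transform: attach the group mean to every row, filter on it, and aggregate the displayed field in the encoding. A sketch using Altair, the Python API for Vega-Lite, assuming monthly grouping and an arbitrary 15-degree threshold:

```python
import altair as alt
from vega_datasets import data

weather = data.seattle_weather()

chart = (
    alt.Chart(weather)
    # Bucket days into months, then attach each month's mean high
    # temperature to every row with a join-aggregate.
    .transform_timeunit(month="yearmonth(date)")
    .transform_joinaggregate(mean_temp="mean(temp_max)", groupby=["month"])
    # Filter on the monthly MEAN...
    .transform_filter(alt.datum.mean_temp > 15)
    .mark_bar()
    # ...while displaying the monthly MAX.
    .encode(x="month:T", y="max(temp_max):Q")
)
chart.save("warm_months.html")
```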

How to properly GROUP BY in MySQL?

雨燕双飞 submitted on 2020-02-29 03:19:10
Question: I have the following (intentionally denormalized for demonstration purposes) sample CARS table:

| CAR_ID | OWNER_ID | OWNER_NAME | COLOR |
|--------|----------|------------|-------|
| 1      | 1        | John       | White |
| 2      | 1        | John       | Black |
| 3      | 2        | Mike       | White |
| 4      | 2        | Mike       | Black |
| 5      | 2        | Mike       | Brown |
| 6      | 3        | Tony       | White |

If I wanted to count the number of cars per owner and return this:

| OWNER_ID | OWNER_NAME | TOTAL |
|----------|------------|-------|
| 1        | John       | 2     |
| 2        | Mike       | …
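The usual answer: with MySQL's ONLY_FULL_GROUP_BY mode enabled, group by every selected non-aggregated column (MySQL 5.7+ also accepts grouping by the key alone when the other columns are functionally dependent on it). A runnable sketch using Python's sqlite3 as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (car_id INTEGER, owner_id INTEGER, owner_name TEXT, color TEXT);
INSERT INTO cars VALUES
  (1, 1, 'John', 'White'), (2, 1, 'John', 'Black'),
  (3, 2, 'Mike', 'White'), (4, 2, 'Mike', 'Black'),
  (5, 2, 'Mike', 'Brown'), (6, 3, 'Tony', 'White');
""")

# Grouping by BOTH owner_id and owner_name keeps the query legal under
# ONLY_FULL_GROUP_BY: every selected, non-aggregated column appears in
# the GROUP BY clause.
rows = conn.execute("""
    SELECT owner_id, owner_name, COUNT(*) AS total
    FROM cars
    GROUP BY owner_id, owner_name
""").fetchall()
print(rows)  # [(1, 'John', 2), (2, 'Mike', 3), (3, 'Tony', 1)]
```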

Select multiple aggregates of a joined table on Postgres

ε祈祈猫儿з submitted on 2020-02-25 13:13:50
Question: Given the tables projects:

id         | bigint                         | not null default nextval('projects_id_seq'::regclass)
name       | character varying              |
created_at | timestamp(6) without time zone | not null
updated_at | timestamp(6) without time zone | not null

and tasks:

id         | bigint                         | not null default nextval('tasks_id_seq'::regclass)
name       | character varying              |
project_id | bigint                         | not null
created_at | timestamp(6) without time zone | not null
updated_at | timestamp(6) without time zone | not null
status     | task…
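The excerpt ends mid-column (status is presumably an enum), so assume the goal is several per-project counts, such as tasks by status. Conditional aggregation returns multiple aggregates from a single join without inflating row counts; Postgres would idiomatically use COUNT(*) FILTER (WHERE ...), and the portable CASE form below runs under Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tasks (
  id INTEGER PRIMARY KEY, name TEXT,
  project_id INTEGER NOT NULL, status TEXT
);
INSERT INTO projects (name) VALUES ('alpha'), ('beta');
INSERT INTO tasks (name, project_id, status) VALUES
  ('t1', 1, 'done'), ('t2', 1, 'open'), ('t3', 1, 'done'),
  ('t4', 2, 'open');
""")

# One aggregate per status from a single LEFT JOIN + GROUP BY:
# each CASE only counts rows matching its status.
rows = conn.execute("""
    SELECT p.id, p.name,
           COUNT(t.id) AS total_tasks,
           SUM(CASE WHEN t.status = 'done' THEN 1 ELSE 0 END) AS done_tasks,
           SUM(CASE WHEN t.status = 'open' THEN 1 ELSE 0 END) AS open_tasks
    FROM projects p
    LEFT JOIN tasks t ON t.project_id = p.id
    GROUP BY p.id, p.name
""").fetchall()
print(rows)  # [(1, 'alpha', 3, 2, 1), (2, 'beta', 1, 0, 1)]
```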

How to sum negative and positive values separately when using groupby in pandas?

谁说我不能喝 submitted on 2020-02-23 11:32:10
Question: How can I sum positive and negative values separately in pandas and put them in, say, positive and negative columns? I have a dataframe like the one below: df = pandas.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'], 'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'], 'C' : np.random.randn(8), 'D' : np.random.randn(8)}) The output is as below:

df
     A      B         C         D
0  foo    one  0.374156  0.319699
1  bar    one -0.356339 -0.629649
2  foo    two -0.390243 -1.387909
3  bar  three -0…
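One idiomatic option is clip(), which zeroes out one sign before summing, so a single groupby produces both columns. A minimal sketch, seeded for repeatability since the question uses random data:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
    "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "C": np.random.randn(8),
    "D": np.random.randn(8),
})

# clip(lower=0) keeps only positive values (negatives become 0), and
# clip(upper=0) keeps only negative ones, so each sum sees one sign.
result = df.groupby(["A", "B"])["C"].agg(
    positive=lambda s: s.clip(lower=0).sum(),
    negative=lambda s: s.clip(upper=0).sum(),
)
print(result)
```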

Pandas monthly rolling window

时间秒杀一切 submitted on 2020-02-23 03:42:31
Question: I am looking to do a 'monthly' rolling window on daily data grouped by a category. The code below does not work as is; it leads to the following error: ValueError: <DateOffset: months=1> is a non-fixed frequency. I know I could use a '30D' offset, but the window boundary would drift relative to the calendar over time. I'm looking for the sum of a window that spans from the x-th day of a month to that same x-th day of the J-th month. E.g. with J=1: 4th of July to 4th of August, 5th of July to 5th of August, 6th of…
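rolling() indeed accepts only fixed frequencies, but a calendar-aware window can be built by hand: offset every start date by DateOffset(months=J) and locate the window edges with searchsorted. A sketch on a toy single-category series, treating the window as half-open (use side="right" in searchsorted for an inclusive end, and apply the function per group via groupby().apply for the grouped case):

```python
import numpy as np
import pandas as pd

# Toy single-category daily series (hypothetical; the question's
# frame is not shown in full).
idx = pd.date_range("2020-07-01", "2020-09-30", freq="D")
s = pd.Series(np.ones(len(idx)), index=idx)

def monthly_window_sum(s: pd.Series, months: int = 1) -> pd.Series:
    """Sum over the half-open window [day, day + months months),
    calendar-aware, which a fixed '30D' window cannot express."""
    ends = s.index + pd.DateOffset(months=months)
    stops = s.index.searchsorted(ends)   # exclusive right edges
    starts = np.arange(len(s))
    # Prefix sums make each window sum a single subtraction.
    csum = np.concatenate([[0.0], s.to_numpy().cumsum()])
    return pd.Series(csum[stops] - csum[starts], index=s.index)

out = monthly_window_sum(s)
print(out.loc["2020-07-04"])  # 31.0: the days in [Jul 4, Aug 4)
```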

Using dplyr to group_by and conditionally mutate a dataframe by group

不想你离开。 submitted on 2020-02-22 06:19:22
Question: I'd like to use dplyr functions to group_by and conditionally mutate a df. Given this sample data:

A B C D
1 1 1 0.25
1 1 2 0
1 2 1 0.5
1 2 2 0
1 3 1 0.75
1 3 2 0.25
2 1 1 0
2 1 2 0.5
2 2 1 0
2 2 2 0
2 3 1 0
2 3 2 0
3 1 1 0.5
3 1 2 0
3 2 1 0.25
3 2 2 1
3 3 1 0
3 3 2 0.75

I want to use a new column E to categorize A by whether B == 1, C == 2, and D > 0. For each unique value of A for which all of these conditions hold true, E = 1; otherwise E = 0. So the output should look like this: A B C D E …
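In dplyr this would be something like group_by(A) followed by mutate(E = as.integer(any(B == 1 & C == 2 & D > 0))); the same logic in pandas, using the question's own data, is a grouped transform:

```python
import pandas as pd

# Data from the question (columns A, B, C, D).
df = pd.DataFrame({
    "A": [1] * 6 + [2] * 6 + [3] * 6,
    "B": [1, 1, 2, 2, 3, 3] * 3,
    "C": [1, 2] * 9,
    "D": [0.25, 0, 0.5, 0, 0.75, 0.25,
          0, 0.5, 0, 0, 0, 0,
          0.5, 0, 0.25, 1, 0, 0.75],
})

# Row-level condition, then broadcast "does any row in this A group
# satisfy it?" back onto every row of the group.
cond = df["B"].eq(1) & df["C"].eq(2) & df["D"].gt(0)
df["E"] = cond.groupby(df["A"]).transform("any").astype(int)

print(df[df["E"] == 1]["A"].unique())  # [2]: only group 2 qualifies
```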