subset | 易学教程

Subsetting a 2D numpy array

Question: I have looked through the documentation and other questions here, but it seems I have not yet got the hang of subsetting NumPy arrays. I have a NumPy array; for the sake of argument, let it be defined as follows:

    import numpy as np
    a = np.arange(100)
    a.shape = (10, 10)
    # array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
    #        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
    #        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
    #        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
    #        [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
    #        ...
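The excerpt breaks off before saying which subset is wanted, so what follows is only a sketch of common ways to subset the 10x10 array defined above, assuming NumPy is installed. The index lists rows and cols are hypothetical and chosen purely for illustration.

```python
import numpy as np

a = np.arange(100).reshape(10, 10)

# Contiguous block: rows 1..3 and columns 1..3 via slicing.
block = a[1:4, 1:4]

# Hypothetical, non-contiguous row and column indices.
rows = [1, 3, 5]
cols = [0, 2, 4]

# np.ix_ builds an open mesh, so the result is the 3x3 cross product
# of the chosen rows and columns.
picked = a[np.ix_(rows, cols)]

# Passing the two lists directly pairs them element-wise instead,
# returning a[1, 0], a[3, 2] and a[5, 4].
paired = a[rows, cols]

print(block)
print(picked)
print(paired)
```

The difference between the last two forms is a frequent source of confusion: two index lists passed directly select individual elements, while np.ix_ selects a sub-matrix.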

Subset and ggplot2

Question: I have a problem plotting a subset of a data frame with ggplot2. My df looks like this:

    ID  Value1  Value2
    P1     100      12
    P1     120      13
    ...
    P2     300      11
    P2     400      16
    ...
    P3     130      15
    P3     140      12
    ...

How can I plot Value1 vs Value2 only for IDs P1 and P3? For example, I tried:

    ggplot(subset(df,ID=="P1 & P3") + geom_line(aes(Value1, Value2, group=ID, colour=ID)))

but I always receive an error. P.S. I also tried many combinations with P1 & P3, but they all failed.

Answer 1: Here are 2 options for subsetting: Using subset from base ...

Subset data to contain only columns whose names match a condition

Question: Is there a way to subset data based on column names starting with a particular string? I have some columns named like ABC_1, ABC_2, ABC_3 and some like XYZ_1, XYZ_2, XYZ_3, let's say. How can I subset my df based only on columns containing those portions of text (let's say, ABC or XYZ)? I could use indices, but the columns are too scattered in the data and it would require too much hard coding. Also, I want to include only the rows from each of these columns where any of their values is > 0 so ...

Subset a dataframe between 2 dates

Question: I am working with daily returns from a Brazilian index (IBOV) since 1993, and I am trying to figure out the best way to subset for periods between 2 dates. The data frame (IBOV_RET) is as follows:

    head(IBOV_RET)
            DATE    1D_RETURN
    1 1993-04-28 -0.008163265
    2 1993-04-29 -0.024691358
    3 1993-04-30  0.016877637
    4 1993-05-03  0.000000000
    5 1993-05-04  0.033195021
    6 1993-05-05 -0.012048193
    ...

I set 2 variables, DATE1 and DATE2, as dates:

    DATE1 <- as.Date("2014-04-01")
    DATE2 <- as.Date("2014-05-05")

I was ...

Reading multiple files and calculating mean based on user input

Question: I am trying to write a function in R which takes 3 inputs: directory, pollutant, and id. I have a directory on my computer full of CSV files (over 300). What this function would do is shown in the prototype below:

    pollutantmean <- function(directory, pollutant, id = 1:332) {
        ## 'directory' is a character vector of length 1 indicating
        ## the location of the CSV files
        ## 'pollutant' is a character vector of length 1 indicating
        ## the name of the pollutant for which we will calculate the
        ## ...

Filtering a data frame on a vector [duplicate]

Question: This question already has answers here: Filter data.frame rows by a logical condition (8 answers). Closed 3 years ago.

I have a data frame df with an ID column, e.g. A, B, etc. I also have a vector containing certain IDs:

    L <- c("A", "B", "E")

How can I filter the data frame to get only the IDs present in the vector? For a single ID I would use subset(df, ID == "A"), but how do I filter on a whole vector?

Answer 1: You can use the %in% operator:

    > df <- data.frame(id=c(LETTERS, LETTERS), x=1 ...

Extract matrix column values by matrix column name

Question: Is it possible to get a matrix column by name from a matrix? I tried various approaches such as myMatrix["test", ] but nothing seems to work.

Answer 1: Yes. But place your "test" after the comma if you want the column...

    > A <- matrix(sample(1:12,12,T),ncol=4)
    > rownames(A) <- letters[1:3]
    > colnames(A) <- letters[11:14]
    > A[,"l"]
     a  b  c
     6 10  1

See also help(Extract).

Answer 2:

    > myMatrix <- matrix(1:10, nrow=2)
    > rownames(myMatrix) <- c("A", "B")
    > colnames(myMatrix) <- c("A", "B", "C", "D", "E")
    > ...

How do I extract a single column from a data.frame as a data.frame? [duplicate]

Question: This question already has an answer here: How to subset matrix to one column, maintain matrix data type, maintain row/column names? (1 answer). Closed 5 years ago.

Say I have a data.frame:

    df <- data.frame(A=c(10,20,30),B=c(11,22,33), C=c(111,222,333))
       A  B   C
    1 10 11 111
    2 20 22 222
    3 30 33 333

If I select two (or more) columns I get a data.frame:

    x <- df[,1:2]
       A  B
    1 10 11
    2 20 22
    3 30 33

This is what I want. However, if I select only one column I get a numeric vector:

    x <- df[,1]
    [1] 10 20 30

I ...

Find all subsets that sum to a particular value

Question: Given a set of numbers: {1, 3, 2, 5, 4, 9}, find the number of subsets that sum to a particular value (say, 9 for this example). This is similar to the subset sum problem, with the slight difference that instead of checking whether the set has a subset that sums to 9, we have to find the number of such subsets. I am following the solution for the subset sum problem here, but I am wondering how I can modify it to return the count of subsets.

Answer 1:

    def total_subsets_matching_sum(numbers, sum):
        array = [1] ...
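Since the answer above is cut off, here is a minimal sketch of one standard way to count such subsets, using a one-dimensional dynamic-programming table. It is not necessarily the approach the truncated answer takes. The function name count_subsets_with_sum is hypothetical, and the sketch assumes all inputs are non-negative integers.

```python
def count_subsets_with_sum(numbers, target):
    """Count subsets of `numbers` whose elements add up to `target`.

    Assumes every number and the target are non-negative integers.
    """
    # ways[s] = number of subsets seen so far that sum to s.
    # The empty subset is the single way to reach a sum of 0.
    ways = [1] + [0] * target

    for n in numbers:
        # Walk sums downwards so each number is used at most once
        # per subset (the usual 0/1 knapsack update order).
        for s in range(target, n - 1, -1):
            ways[s] += ways[s - n]

    return ways[target]


print(count_subsets_with_sum([1, 3, 2, 5, 4, 9], 9))
```

For the set in the question, the subsets that sum to 9 are {9}, {4, 5}, {1, 3, 5} and {2, 3, 4}. The table uses O(target) memory and the loops take O(len(numbers) * target) time.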

How to replace NA with mean by group / subset?

Question: I have a dataframe with the lengths and widths of various arthropods from the guts of salamanders. Because some guts had thousands of certain prey items, I only measured a subset of each prey type. I now want to replace each unmeasured individual with the mean length and width for that prey. I want to keep the dataframe and just add imputed columns (length2, width2). The main reason is that each row also has columns with data on the date and location where the salamander was collected. I could fill ...