subset

Subsetting a 2D numpy array

匆匆过客 submitted on 2019-11-26 06:46:01
Question: I have looked through the documentation and other questions here, but it seems I have not yet got the hang of subsetting NumPy arrays. I have a NumPy array; for the sake of argument, let it be defined as follows: import numpy as np a = np.arange(100) a.shape = (10,10) # array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], # [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], # [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], # [40, 41, 42, 43, 44, 45, 46, 47, 48, 49], #
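The excerpt above is cut off before the actual request, so the following is only a sketch of common ways to subset the 10x10 array it defines. The index lists `rows` and `cols` are hypothetical values chosen for illustration, not taken from the question.

```python
import numpy as np

a = np.arange(100).reshape(10, 10)

# Basic slicing: rows 1-2 and columns 3-5 (returns a view, not a copy).
block = a[1:3, 3:6]

# Hypothetical index lists, chosen only for illustration.
rows = [1, 3, 5]
cols = [0, 2, 4, 6]

# np.ix_ builds an open mesh, so every (row, col) combination is selected.
cross = a[np.ix_(rows, cols)]      # shape (3, 4)

# Two equal-length lists passed directly are paired element-wise instead.
paired = a[[1, 3, 5], [0, 2, 4]]   # elements at (1,0), (3,2), (5,4)

print(block)
print(cross)
print(paired)
```

The distinction between `np.ix_` and plain paired lists is a frequent source of confusion: the first selects a rectangular sub-array, the second picks individual elements.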

Subset and ggplot2

南楼画角 submitted on 2019-11-26 06:36:12
Question: I have a problem plotting a subset of a data frame with ggplot2. My df is like: ID Value1 Value2 P1 100 12 P1 120 13 ... P2 300 11 P2 400 16 ... P3 130 15 P3 140 12 ... How can I now plot Value1 vs Value2 only for IDs P1 and P3? For example I tried: ggplot(subset(df,ID=="P1 & P3") + geom_line(aes(Value1, Value2, group=ID, colour=ID))) but I always receive an error. P.S. I also tried many combinations with P1 & P3 but I always failed. Answer 1: Here are 2 options for subsetting: Using subset from base

Subset data to contain only columns whose names match a condition

落爺英雄遲暮 submitted on 2019-11-26 05:55:51
Question: Is there a way for me to subset data based on column names starting with a particular string? I have some columns which are like ABC_1, ABC_2, ABC_3 and some like XYZ_1, XYZ_2, XYZ_3, let's say. How can I subset my df based only on columns containing the above portions of text (let's say, ABC or XYZ)? I can use indices, but the columns are too scattered in the data and it becomes too much hard coding. Also, I want to only include rows from each of these columns where any of their values is >0 so

Subset a dataframe between 2 dates

痞子三分冷 submitted on 2019-11-26 05:32:39
Question: I am working with daily returns from a Brazilian index (IBOV) since 1993, and I am trying to figure out the best way to subset for periods between 2 dates. The data frame (IBOV_RET) is as follows: head(IBOV_RET) DATE 1D_RETURN 1 1993-04-28 -0.008163265 2 1993-04-29 -0.024691358 3 1993-04-30 0.016877637 4 1993-05-03 0.000000000 5 1993-05-04 0.033195021 6 1993-05-05 -0.012048193 ... I set 2 variables DATE1 and DATE2 as dates: DATE1 <- as.Date("2014-04-01") DATE2 <- as.Date("2014-05-05") I was

Reading multiple files and calculating mean based on user input

只愿长相守 submitted on 2019-11-26 04:51:14
Question: I am trying to write a function in R which takes 3 inputs: directory, pollutant, and id. I have a directory on my computer full of CSV files, i.e. over 300. What this function would do is shown in the prototype below: pollutantmean <- function(directory, pollutant, id = 1:332) { ## 'directory' is a character vector of length 1 indicating ## the location of the CSV files ## 'pollutant' is a character vector of length 1 indicating ## the name of the pollutant for which we will calculate the ##

Filtering a data frame on a vector [duplicate]

不想你离开。 submitted on 2019-11-26 04:50:26
Question: This question already has answers here: Filter data.frame rows by a logical condition (8 answers). Closed 3 years ago. I have a data frame df with an ID column, e.g. A, B, etc. I also have a vector containing certain IDs: L <- c("A", "B", "E") How can I filter the data frame to get only the IDs present in the vector? Individually, I would use subset(df, ID == "A") but how do I filter on a whole vector? Answer 1: You can use the %in% operator: > df <- data.frame(id=c(LETTERS, LETTERS), x=1

Extract matrix column values by matrix column name

。_饼干妹妹 submitted on 2019-11-26 04:48:08
Question: Is it possible to get a matrix column by name from a matrix? I tried various approaches such as myMatrix["test", ] but nothing seems to work. Answer 1: Yes, but place your "test" after the comma if you want the column: > A <- matrix(sample(1:12,12,T),ncol=4) > rownames(A) <- letters[1:3] > colnames(A) <- letters[11:14] > A[,"l"] a b c 6 10 1 See also help(Extract). Answer 2: > myMatrix <- matrix(1:10, nrow=2) > rownames(myMatrix) <- c("A", "B") > colnames(myMatrix) <- c("A", "B", "C", "D", "E") >

How do I extract a single column from a data.frame as a data.frame? [duplicate]

╄→尐↘猪︶ㄣ submitted on 2019-11-26 04:26:05
Question: This question already has an answer here: How to subset matrix to one column, maintain matrix data type, maintain row/column names? (1 answer). Closed 5 years ago. Say I have a data.frame: df <- data.frame(A=c(10,20,30),B=c(11,22,33), C=c(111,222,333)) A B C 1 10 11 111 2 20 22 222 3 30 33 333 If I select two (or more) columns I get a data.frame: x <- df[,1:2] A B 1 10 11 2 20 22 3 30 33 This is what I want. However, if I select only one column I get a numeric vector: x <- df[,1] [1] 10 20 30 I

find all subsets that sum to a particular value

倾然丶 夕夏残阳落幕 submitted on 2019-11-26 03:49:18
Question: Given a set of numbers: {1, 3, 2, 5, 4, 9}, find the number of subsets that sum to a particular value (say, 9 for this example). This is similar to the subset sum problem, with the slight difference that instead of checking whether the set has a subset that sums to 9, we have to find the number of such subsets. I am following the solution for the subset sum problem here, but I am wondering how I can modify it to return the count of subsets. Answer 1: def total_subsets_matching_sum(numbers, sum): array = [1]
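The counting variant described in the question can be sketched with a standard dynamic-programming table. This is not a reconstruction of the truncated answer above; it is a minimal illustration that assumes the inputs are non-negative integers, and the function name `count_subsets_with_sum` is hypothetical.

```python
def count_subsets_with_sum(numbers, target):
    """Count subsets of `numbers` whose elements add up to `target`.

    Assumes every number is a non-negative integer.
    """
    # ways[s] holds how many subsets of the numbers seen so far sum to s.
    ways = [0] * (target + 1)
    ways[0] = 1  # the empty subset sums to 0

    for n in numbers:
        # Walk downwards so each number is used at most once per subset.
        for s in range(target, n - 1, -1):
            ways[s] += ways[s - n]

    return ways[target]


# The example set from the question: {9}, {4, 5}, {1, 3, 5}, {2, 3, 4}.
print(count_subsets_with_sum([1, 3, 2, 5, 4, 9], 9))  # 4
```

Replacing the boolean "reachable" table of the usual subset sum solution with integer counts is the only change needed; the running time stays proportional to the number of elements times the target.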

How to replace NA with mean by group / subset?

浪子不回头ぞ submitted on 2019-11-26 02:35:51
Question: I have a dataframe with the lengths and widths of various arthropods from the guts of salamanders. Because some guts had thousands of certain prey items, I only measured a subset of each prey type. I now want to replace each unmeasured individual with the mean length and width for that prey. I want to keep the dataframe and just add imputed columns (length2, width2). The main reason is that each row also has columns with data on the date and location the salamander was collected. I could fill