sampling | 易学教程

drawing a stratified sample in R

Question: I am designing my stratified sample: library(survey) design <- svydesign(id=~1, strata=~Category, data=billa, fpc=~fpc) So far so good, but how can I now draw a sample in the same way I could for simple random sampling? set.seed(67359) samplerows <- sort(sample(x=1:N, size=n.pre$n))

Answer 1: If you have a stratified design, then I believe you can sample randomly within each stratum. Here is a short algorithm to do proportional sampling in each stratum, using ddply: library(plyr) set.seed(1) dat <- data
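The answer's ddply code is cut off in this excerpt, so what follows is not that code. It is a minimal Python sketch of the same idea (proportional sampling within each stratum), using only the standard library. The function name `proportional_stratified_sample`, the `fraction` parameter, and the toy strata "A" and "B" are assumptions made for illustration.

```python
import random
from collections import defaultdict


def proportional_stratified_sample(rows, stratum_of, fraction, seed=None):
    """Draw the same fraction of rows from every stratum, without replacement."""
    rng = random.Random(seed)

    # Group the rows by stratum.
    strata = defaultdict(list)
    for row in rows:
        strata[stratum_of(row)].append(row)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Python's round() rounds halves to the nearest even integer.
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: 40 rows in stratum "A", 60 rows in stratum "B".
toy_rows = [("A", i) for i in range(40)] + [("B", i) for i in range(60)]
picked = proportional_stratified_sample(toy_rows, lambda r: r[0], 0.1, seed=1)
```

With a 10% fraction, each stratum contributes in proportion to its size, so the sample keeps the stratum shares of the full data.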

Adding offset and delay

Question: I have a signal into which I want to introduce several offsets and delays, where offsets range from 0.5 to 5 and delays range from 1 to 7. I'm providing an example signal here to demonstrate the problem I'm having, but the size of my real data is 1x1666520. How do I introduce these changes to the signal? Example code: t = [ 0 : 1 : 50]; % Time Samples f = 45; % Input Signal Frequency Fs = 440; % Sampling Frequency data = sin(2*pi*f/Fs*t)'; T.InputOffset = 5; T.OutputOffset = 5; addoffset =
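One possible reading of the question, sketched in Python rather than MATLAB: an "offset" adds a constant to every sample, and a "delay" shifts the signal right by a whole number of samples while keeping its length. Both interpretations are assumptions, as are the helper names `add_offset` and `add_delay` and the zero padding value. The signal parameters are the ones from the example code.

```python
import math


def add_offset(signal, offset):
    """Return a copy of the signal with a constant added to every sample."""
    return [x + offset for x in signal]


def add_delay(signal, delay, fill=0.0):
    """Shift the signal right by `delay` samples, padding the front with `fill`.

    The output has the same length as the input, so the last `delay`
    samples of the original signal are dropped.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return ([fill] * delay + list(signal))[:len(signal)]


# Same example signal as in the question: t = 0..50, f = 45, Fs = 440.
f, Fs = 45, 440
data = [math.sin(2 * math.pi * f / Fs * t) for t in range(51)]

# Offset of 5 followed by a delay of 3 samples.
shifted = add_delay(add_offset(data, 5), 3)
```

Looping over the offset and delay values of interest and calling the two helpers would produce the full set of modified signals.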

Using a sample list as a template for sampling from a larger list without wraparound

Question: If I have a vector of letters: > all <- letters > all [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" and then I define a reference sample from letters as follows: > refSample <- c("j","l","m","s") in which the spacing between elements is 2 (1st to 2nd), 1 (2nd to 3rd) and 6 (3rd to 4th), how can I then select n samples from all whose elements have the same non-wrap-around spacing as refSample? For example, "a","c","d",
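A minimal Python sketch of one way to approach this: turn the reference sample into offsets from its first element, then pick random start positions that leave room for the largest offset, so no sample wraps around. The function name `sample_like_template` is hypothetical. The sketch assumes the reference sample is listed in the same order as the population and that the n start positions should be distinct.

```python
import random
import string


def sample_like_template(population, template, n, seed=None):
    """Pick n samples whose element spacing matches `template`, without wraparound."""
    rng = random.Random(seed)

    # Offsets of each template element relative to the first one.
    positions = [population.index(item) for item in template]
    offsets = [p - positions[0] for p in positions]

    # Last start index that still keeps the final element inside the population.
    last_start = len(population) - 1 - max(offsets)

    # Distinct start positions; raises ValueError if n exceeds the number available.
    starts = rng.sample(range(last_start + 1), n)
    return [[population[s + o] for o in offsets] for s in starts]


all_letters = list(string.ascii_lowercase)
ref_sample = ["j", "l", "m", "s"]
samples = sample_like_template(all_letters, ref_sample, 5, seed=0)
```

For the letters example the offsets are 0, 2, 3 and 9, so valid start positions run from "a" through "q".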

Sampling without replacement with a different sample size per group in SQL

Question: Using the provided table, I would like to randomly sample users per day. The number of users to be sampled is specified in the to_sample column, which is filled by another query. In this example I would like to sample 1 observation for day one and 2 observations for day two (but this will change with every execution of the query, so don't fixate on these numbers). I would like the users assigned to different days to be different (no overlapping assignment). drop table if exists test;

Matlab reading from serial port at specific sampling rate

Question: I am trying to read values from two sensors (on my Arduino) that are being sent to the serial port, using the MATLAB code below. However, it fails with the error ??? Attempted to access sensor1(1); index out of bounds because numel(sensor1)=0, and when the error does not occur the results are not accurate. I know this because I simply sent 1 and 2 as the sensor values to the COM port, and the resulting two arrays contained some zeros too (when one should be all 1's and the other all 2's). Thanks any help

Generating same random variable in Rcpp and R

Question: I am converting my sampling algorithm from R to Rcpp. The outputs of the Rcpp and R versions do not match, so there is some bug in the Rcpp code (and the difference is not due to randomization). I am trying to match the internal variables of the Rcpp code with those of the R code. However, this is problematic because of the randomization introduced by sampling from a distribution. Rcpp::rbinom(1, 1, 10) rbinom(1, 1, 10) How can I make the code give the same output in R and Rcpp, that is, set a common seed for R and Rcpp? Answer 1

Implementing Reservoir Sampling using Map Reduce

Question: This link "http://had00b.blogspot.com/2013/07/random-subset-in-mapreduce.html" talks about how one can implement reservoir sampling using the MapReduce framework. I feel their solution is complicated and that the following simpler approach would work. Problem: Given a very large number of samples, generate a set of size k such that each sample has equal probability of being present in the set. Proposed solution: Map operation: For each input number n, output (i, n) where i is randomly chosen in range 0
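For reference, the problem statement above (a set of size k in which every input has equal probability of appearing) is what classic single-pass reservoir sampling solves on one machine. The Python sketch below shows that baseline algorithm only. It is neither the MapReduce proposal from the question, which is cut off here, nor the approach from the linked post. The function name `reservoir_sample` is chosen for illustration.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items chosen uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # After i + 1 items, keep the new one with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


subset = reservoir_sample(range(1_000_000), 10, seed=7)
```

The stream is read once and only k items are held in memory, which is why this is the usual starting point before distributing the work across mappers and reducers.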

stratified sampling with group size below sample size in R

Question: I have response data by market in the format: head(df) ID market q1 q2 470 France 1 3 625 Germany 0 2 155 Italy 1 6 648 Spain 0 5 862 France 1 7 699 Germany 0 8 460 Italy 1 6 333 Spain 1 5 776 Spain 1 4 and the following frequencies: table(df$market) France 140 Germany 300 Italy 50 Spain 75 I need to create a data frame with a sample of 100 responses per market, drawn without replacement, or all responses for markets that have fewer than 100. So table(df_new$market) France 100 Germany 100
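The requirement amounts to drawing min(100, group size) rows from each market. Here is a minimal Python sketch of that rule using the market frequencies given in the question. The function name `capped_sample_per_group` and the placeholder rows (market plus a running index instead of real responses) are assumptions for illustration.

```python
import random
from collections import defaultdict


def capped_sample_per_group(rows, group_of, cap, seed=None):
    """Sample up to `cap` rows per group without replacement.

    Groups smaller than `cap` are returned in full.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[group_of(row)].append(row)

    out = []
    for key in sorted(groups):
        members = groups[key]
        out.extend(rng.sample(members, min(cap, len(members))))
    return out


# Group sizes from the question; each row is a placeholder (market, index) pair.
sizes = {"France": 140, "Germany": 300, "Italy": 50, "Spain": 75}
market_rows = [(market, i) for market, n in sizes.items() for i in range(n)]
df_new = capped_sample_per_group(market_rows, lambda r: r[0], cap=100, seed=42)
```

Taking the minimum of the cap and the group size avoids the error that sampling without replacement raises when asked for more rows than a group contains.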

Splitting Dataframe into Confirmatory and Exploratory Samples

Question: I have a very large dataframe (N = 107,251) that I wish to split into roughly equal halves (~53,625). However, I would like the split to be done such that three variables are kept in equal proportion in the two sets (Gender, Age Category with 6 levels, and Region with 5 levels). I can generate the proportions for the variables independently (e.g., via prop.table(xtabs(~dat$Gender))) or in combination (e.g., via prop.table(xtabs(~dat$Gender + dat$Region + dat$Age))), but I'm
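One way to keep all three variables balanced at once is to treat every Gender x Age x Region combination as a cell, shuffle within each cell, and send half of each cell to each set. The Python sketch below illustrates that idea on synthetic data. The function name `stratified_halves` and the generated rows are hypothetical, and the rule for odd-sized cells (give the extra row to whichever half is currently smaller) is one choice among several.

```python
import random
from collections import defaultdict


def stratified_halves(rows, key_of, seed=None):
    """Split rows into two halves, balancing every cell defined by `key_of`."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for row in rows:
        cells[key_of(row)].append(row)

    first, second = [], []
    for key in sorted(cells):
        members = cells[key][:]
        rng.shuffle(members)
        half = len(members) // 2
        # Odd-sized cell: the extra row goes to the half that is not larger.
        if len(members) % 2 and len(first) <= len(second):
            half += 1
        first.extend(members[:half])
        second.extend(members[half:])
    return first, second


# Synthetic rows: (id, gender, age category 0-5, region 0-4).
people = [(i, "F" if i % 2 else "M", i % 6, i % 5) for i in range(1001)]
exploratory, confirmatory = stratified_halves(people, lambda r: r[1:], seed=3)
```

Because the split happens inside each cell, the joint distribution and therefore each marginal proportion stays nearly identical across the two sets.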

Undefined function 'minus' for input argument of type 'iddata'

Question: This is a follow-up to a previous issue I was having. I want to add an offset to a signal, then add some delay to it and calculate the RMSE, but when taking the difference I run into the following issue: I would like to ask the following things: How can I solve the above problem? Will anybody please explain in simple words what iddata does? I have studied different portals, including MATLAB's, but still could not get a good grasp of the concept. How can I store data of type iddata in cell for