sample

sample rows of subgroups from dataframe with dplyr

三世轮回 submitted on 2019-11-27 01:20:14
If I want to randomly select some samples from different groups, I use the plyr package and the code below:

    require(plyr)
    sampleGroup <- function(df, size) {
      df[sample(nrow(df), size = size), ]
    }
    iris.sample <- ddply(iris, .(Species), function(df) sampleGroup(df, 10))

Here 10 samples are selected from each species. Some of my dataframes are very big, and my question is: can I use the same sampleGroup function with the dplyr package? Or is there another way to do the same in dplyr?

EDIT: Version 0.2 of the dplyr package introduced two new functions to select random rows from a table: sample_n and sample_frac

Random Sample of a subset of a dataframe in Pandas

南笙酒味 submitted on 2019-11-26 21:45:51
Question: Say I have a dataframe with 100,000 entries and want to split it into 100 sections of 1000 entries. How do I take a random sample of, say, size 50 from just one of the 100 sections? The data set is already ordered such that the first 1000 results are the first section, the next 1000 are the next section, and so on. Many thanks.

Answer 1: You can use the sample method*:

    In [11]: df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], columns=["A", "B"])

    In [12]: df.sample(2)
    Out[12]:
       A  B
    0  1  2
    2  5  6

    In [13]: df.sample
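The section arithmetic described in the question can be sketched without pandas at all. This is a minimal illustration, assuming the sections are contiguous, equally sized blocks; `sample_section_positions` is a hypothetical helper name, not a pandas function. It returns row positions only, which could then be handed to a positional indexer such as `df.iloc`.

```python
import random


def sample_section_positions(section, section_size=1000, k=50, seed=None):
    """Return k distinct row positions drawn from one contiguous section.

    Hypothetical helper: section 0 covers rows 0..section_size-1,
    section 1 covers the next section_size rows, and so on.
    """
    rng = random.Random(seed)
    start = section * section_size
    stop = start + section_size
    # random.sample draws without replacement from the given range.
    return sorted(rng.sample(range(start, stop), k))


# Example: 50 positions from the fourth section (rows 3000..3999).
positions = sample_section_positions(section=3, seed=0)
print(len(positions), min(positions), max(positions))
```

Drawing positions first keeps the sampling step independent of how the data is stored; the same list works for a list of rows or a dataframe.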

How do I open a sample Android SDK app in Eclipse

大城市里の小女人 submitted on 2019-11-26 19:34:45
Question: I have created a basic program in Eclipse for Android 2.1. Then I wanted to look at some of the samples and import the sample projects that are in the SDK directory. I have tried opening a new project with 'create project from existing source': I can browse to and select a project, all the details come up, and I can click Finish, but I receive an error message saying 'could not write file', and if I click on details, it says "access is denied". I have tried copying a project folder into my own

Sample n random rows per group in a dataframe

ⅰ亾dé卋堺 submitted on 2019-11-26 18:49:42
From these questions - Random sample of rows from subset of an R dataframe & Sample random rows in dataframe - I can easily see how to randomly sample (select) 'n' rows from a df, or 'n' rows that originate from a specific level of a factor within a df. Here are some sample data:

    df <- data.frame(matrix(rnorm(80), nrow=40))
    df$color <- rep(c("blue", "red", "yellow", "pink"), each=10)

    df[sample(nrow(df), 3), ]  # samples 3 random rows from df, without replacement

To sample, e.g., just 3 random rows from the 'pink' color, using library(kimisc):

    library(kimisc)
    sample.rows(subset(df, color == "pink"), 3

Needed: A Windows Service That Executes Jobs from a Job Queue in a DB; Wanted: Example Code

痴心易碎 submitted on 2019-11-26 17:05:27
Question:
Needed: A Windows Service That Executes Jobs from a Job Queue in a DB
Wanted: Example Code, Guidance, or Best Practices for this type of Application
Background: A user will click on an ashx link that will insert a row into the DB. I need my Windows service to periodically poll for rows in this table, and it should execute a unit of work for each row.
Emphasis: This isn't completely new terrain for me.
EDIT: You can assume that I know how to create a Windows Service and basic data access. But

Choosing n numbers with fixed sum

我们两清 submitted on 2019-11-26 14:38:19
In some code I want to choose n random numbers in [0,1) which sum to 1. I do so by choosing the numbers independently in [0,1) and normalizing them by dividing each one by the total sum:

    from random import random
    numbers = [random() for _ in range(n)]
    total = sum(numbers)  # compute the sum once; do not reuse n as the loop variable
    numbers = [x / total for x in numbers]

My "problem" is that the distribution I get is quite skewed. When choosing a million numbers, not a single one gets over 1/2. With some effort I've calculated the pdf, and it's not nice. Here is the weird-looking pdf I get for 5 variables: Do you have an idea for a nice algorithm to choose the numbers that results in a more uniform
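For comparison, here is a small sketch that puts the normalization approach from the question next to one commonly used alternative. The alternative, taking the gaps between sorted uniform cut points, is an assumption brought in for illustration and is not taken from the thread; both function names are hypothetical.

```python
import random


def normalized_uniforms(n, rng=random):
    """Approach from the question: draw n uniforms, divide each by their sum."""
    draws = [rng.random() for _ in range(n)]
    total = sum(draws)  # zero only if every draw is exactly 0.0
    return [x / total for x in draws]


def sorted_spacings(n, rng=random):
    """Alternative (assumption, not from the question): the gaps between
    n - 1 sorted uniform cut points on [0, 1]."""
    cuts = sorted(rng.random() for _ in range(n - 1))
    edges = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]


rng = random.Random(0)
print(sum(normalized_uniforms(5, rng)), sum(sorted_spacings(5, rng)))
```

Both functions return n non-negative values that sum to 1 up to floating-point error. The spacing construction is uniform over all such vectors, so a single value above 1/2 is not unusual with it.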

Sample http range request session

你。 submitted on 2019-11-26 14:03:50
Is it possible to show me a sample HTTP session with range requests? I mean, what would the request and response headers be?

The following exchange is between Chrome and a static web server, retrieving an MP4 video.

Initial request - for the video. Note the Accept-Ranges response header to indicate the server has range header support:

    GET /BigBuckBunny_320x180.mp4
    Cache-Control: max-age=0
    Connection: keep-alive
    Accept-Language: en-GB,en-US,en
    Host: localhost:8080
    Range:
    Accept: text/html,application/xhtml+xml,application/xml,*/*
    User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.7 ...
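To make the header mechanics concrete, the sketch below shows how a server might turn a single `Range: bytes=...` request header into a status code and a `Content-Range` response header. It is a simplified illustration, not the behaviour of the server in the excerpt: `resolve_byte_range` is a hypothetical helper, multi-range requests are rejected, and any range that ends up empty or out of bounds is reported as 416.

```python
def resolve_byte_range(range_header, total_size):
    """Resolve a single-range header value such as "bytes=0-499".

    Returns (status, content_range, start, end) with an inclusive end.
    Hypothetical helper; simplified relative to the full HTTP rules.
    """
    unit, _, spec = range_header.partition("=")
    spec = spec.strip()
    if unit.strip() != "bytes" or "," in spec or "-" not in spec:
        raise ValueError("only a single bytes range is supported here")
    first, _, last = spec.partition("-")
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the resource.
        start = max(total_size - int(last), 0)
        end = total_size - 1
    else:
        # "bytes=A-B" or open-ended "bytes=A-"; clamp the end to the size.
        start = int(first)
        end = min(int(last), total_size - 1) if last else total_size - 1
    if start > end or start >= total_size:
        # Nothing to send: 416 with the "bytes */size" form of Content-Range.
        return 416, f"bytes */{total_size}", None, None
    return 206, f"bytes {start}-{end}/{total_size}", start, end


status, content_range, start, end = resolve_byte_range("bytes=0-499", 1234)
print(status, content_range)
```

A 206 response built this way would carry `end - start + 1` bytes in its body, alongside the `Accept-Ranges: bytes` header mentioned above.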

Where can I find sample databases with common formatted data that I can use in multiple database engines? [closed]

主宰稳场 submitted on 2019-11-26 12:49:39
Question: Closed. This question is off-topic. It is not currently accepting answers. Closed 3 years ago.

Does anybody know of any sample databases I could download, preferably in CSV or some similar easy-to-import format, so that I could get more practice in working with different types of data sets? I know that the Canadian Department of Environment has historical weather data that you can download. However, it's

Android sample bluetooth code to send a simple string via bluetooth

家住魔仙堡 submitted on 2019-11-26 11:13:13
I want to send simple string data such as 'a' from one Android device to another via Bluetooth. I looked at the sample Bluetooth code in the Android SDK, but it is too complex for me. I cannot understand how to send only specific data when I press a button. How can I solve this problem?

eleven

    private OutputStream outputStream;
    private InputStream inStream;

    private void init() throws IOException {
        BluetoothAdapter blueAdapter = BluetoothAdapter.getDefaultAdapter();
        if (blueAdapter != null) {
            if (blueAdapter.isEnabled()) {
                Set<BluetoothDevice> bondedDevices = blueAdapter.getBondedDevices();
                if

Take a random sample based on groups

孤街醉人 submitted on 2019-11-26 08:27:19
Question: I have a df made up of almost 50,000 rows spread across 15 different IDs (every ID has thousands of observations). The df looks like:

          ID  Year  Temp    ph
    1     P1  1996  11.3  6.80
    2     P1  1996   9.7  6.90
    3     P1  1997   9.8  7.10
    ...
    2000  P2  1997  10.5  6.90
    2001  P2  1997   9.9  7.00
    2002  P2  1997  10.0  6.93

I want to take 500 random rows for every ID (so 500 for P1, 500 for P2, ...) and create a new df. I tried:

    new_df <- df[df$ID %in% sample(unique(df$ID), 500), ]

But it randomly picks an ID, while I need 500 random rows for every ID.
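The attempt in the question samples from the unique IDs rather than from the rows within each ID. The grouping logic that is needed can be sketched as follows. It is written in plain Python rather than R, purely to illustrate the idea; `sample_per_group` is a hypothetical helper and the rows are assumed to be dictionaries.

```python
import random
from collections import defaultdict


def sample_per_group(rows, key, n, seed=None):
    """Return up to n randomly chosen rows for each distinct value of key."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sampled = []
    for group_rows in groups.values():
        # Cap at the group size so small groups do not raise ValueError.
        sampled.extend(rng.sample(group_rows, min(n, len(group_rows))))
    return sampled


# Toy data standing in for the question's table: 10 rows for P1, 3 for P2.
rows = [{"ID": "P1", "Temp": 9.0 + i} for i in range(10)]
rows += [{"ID": "P2", "Temp": 10.0 + i} for i in range(3)]
subset = sample_per_group(rows, key="ID", n=5, seed=0)
print(len(subset))
```

The key point is that the random draw happens inside each group, after the rows have been split by ID, rather than on the list of IDs.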