resampling | 易学教程

logarithmically spaced integers

Say I have a 10,000-point vector and want to take a slice of only 100 logarithmically spaced points. I want a function that gives me integer values for the indices. Here's a simple solution that uses around + logspace and then gets rid of duplicates:

    from numpy import around, logspace, log10, uint64, array

    def genLogSpace(array_size, num):
        # Round log-spaced positions to integers, then drop duplicates.
        lspace = around(logspace(0, log10(array_size), num)).astype(uint64)
        return array(sorted(set(lspace.tolist()))) - 1

    ls = genLogSpace(1e4, 100)
    print(ls.size)
    >>84
    print(ls)
    array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 17, 19, 21, 23, 25, 27, 30, 33, 37, 40, 44, 49, 54, 59, 65, 71, 78, 86, 94, 104, 114
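The around + logspace idea above can be sketched a little more compactly with np.unique, which sorts and de-duplicates in one step. This is a minimal sketch rather than the asker's final solution; the function name log_spaced_indices is made up for illustration. As in the question, rounding collapses neighbouring values at the low end, so fewer than num indices come back.

```python
import numpy as np

def log_spaced_indices(array_size, num):
    """Return sorted, unique, zero-based indices that are roughly log-spaced.

    Hypothetical helper name; it mirrors the around + logspace approach.
    """
    # Positions 1..array_size on a log scale, rounded to the nearest integer.
    positions = np.round(np.logspace(0, np.log10(array_size), num)).astype(np.int64)
    # np.unique sorts and drops the duplicates that rounding creates.
    return np.unique(positions) - 1

idx = log_spaced_indices(10_000, 100)
print(idx.size)   # fewer than 100, because duplicates were removed
print(idx[:12])
```

The result can be used directly for slicing, for example vector[idx].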

Both fast and very slow scipy.signal.resample with the same input size

Question: According to the documentation of scipy.signal.resample, the speed should vary with the length of the input:

    As noted, resample uses FFT transformations, which can be very slow if the number of input samples is large and prime, see scipy.fftpack.fft.

But I get very different timings (a factor of 14) with the same input and only a small variation in the desired output size:

    import numpy as np, time
    from scipy.signal import resample
    x = np.random.rand(262144, 2)
    y = np.random.rand(262144, 2)
    t0
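One hedged way to reason about timings like these is to look at how each length factorizes. The quoted note covers the input length. It is an assumption here, worth checking against the SciPy source for your version, that the inverse transform inside resample has the length of the requested output, so the output size would matter too. The helper below is hypothetical and only inspects lengths; it does not time anything. The sizes 210 and 131071 are illustrative and do not come from the question.

```python
def largest_prime_factor(n):
    """Largest prime factor of n (n >= 2), found by trial division."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    # Whatever remains above 1 is itself a prime, larger than any factor seen.
    return n if n > 1 else largest

# 262144 is the input length from the question; the other sizes are made up.
for size in (262144, 210, 131071):
    print(size, largest_prime_factor(size))
```

Lengths whose largest prime factor is small (262144 is a power of two) tend to suit FFT routines, while a length that is itself a large prime is the slow case the documentation warns about.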

Python pandas time series resampling

I am writing a script in pandas, but I cannot get the output I want. Here is the problem: I can read this data from a CSV file. The table structure is shown at http://postimg.org/image/ie0od7ejr/ I want this output from the table data above:

    Month      Demo1  Demo2
    June 2013  3      1
    July 2013  2      2

In the Demo1 and Demo2 columns I want to count the regular entries and the entries that start with u. For June there are 3 regular entries in total and 1 entry that starts with u. So far I have written this code:

    import sqlite3
    from pylab import *
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib
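The monthly count described above could be sketched along the following lines. The real table is only available as an image, so the column names date and entry and all the sample values are assumptions made up for illustration; they were chosen to reproduce the June and July counts quoted in the question.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; the real column names are not shown.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2013-06-03", "2013-06-10", "2013-06-17", "2013-06-24",
        "2013-07-01", "2013-07-08", "2013-07-15", "2013-07-22",
    ]),
    "entry": ["a1", "b2", "u3", "c4", "u5", "d6", "u7", "e8"],
})

# Boolean mask: True where the entry starts with "u".
starts_with_u = df["entry"].str.startswith("u")
# Calendar month of each row, used as the grouping key.
month = df["date"].dt.to_period("M")

counts = pd.DataFrame({
    "Demo1": (~starts_with_u).groupby(month).sum(),  # regular entries
    "Demo2": starts_with_u.groupby(month).sum(),     # entries starting with "u"
})
print(counts)
```

Summing a boolean mask per group counts the True values, which avoids a separate filter-then-count pass for each column.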

Resampling timeseries with a given timedelta

I am using pandas to structure and process data. This is my DataFrame: I want to resample time-series data so that, for every ID (named here "3"), I have all bitrate scores from beginning to end (beginning_time / end_time). For example, for the first row, I want every second from 2016-07-08 02:17:42 to 2016-07-08 02:17:55, each with the same bitrate score and, of course, the same ID. Something like this: For example, given:

    df = pd.DataFrame( {'Id' : ['CODI126640013.ts', 'CODI126622312.ts'], 'beginning_time':['2016-07-08 02:17:42', '2016-07-08 02:05:35'], 'end_time' :['2016-07-08 02
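One way to expand each interval into one row per second is sketched below. The source DataFrame is cut off, so the second row's end_time, the bitrate column name, and the bitrate values are all hypothetical; only the first row's Id and times come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Id": ["CODI126640013.ts", "CODI126622312.ts"],
    "beginning_time": ["2016-07-08 02:17:42", "2016-07-08 02:05:35"],
    # The second end_time is hypothetical; the original value is truncated.
    "end_time": ["2016-07-08 02:17:55", "2016-07-08 02:05:40"],
    # Hypothetical column name and values.
    "bitrate": [3750, 3500],
})

one_second = pd.Timedelta(seconds=1)
pieces = [
    pd.DataFrame({
        "Id": row.Id,
        # Every second from beginning_time to end_time, both inclusive.
        "time": pd.date_range(row.beginning_time, row.end_time, freq=one_second),
        "bitrate": row.bitrate,
    })
    for row in df.itertuples(index=False)
]
expanded = pd.concat(pieces, ignore_index=True)
print(expanded.head())
```

Each scalar (Id, bitrate) is broadcast across the generated timestamps, so every expanded row carries the same score and ID as its source row.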

Efficient algorithm for generating unique (non-repeating) random numbers

I want to solve the following problem. I have to sample from an extremely large set, of the order of 10^20, and extract a sample without repetitions of about 10%-20% of its size. Given the size of the set, I believe that an algorithm like Fisher–Yates is not feasible. I'm thinking that something like a random path tree might work for doing it in O(n log n) and that it can't be done faster, but I want to ask whether something like this has already been implemented. Thank you for your time!

I don't know how well the technique I describe below would do on formal tests of randomness, but it does give "random-looking
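The answer above is cut off, so its technique is not known. As a separate illustration, one commonly used way to enumerate a huge range without repeats and without storing it is a keyed permutation: a small Feistel network with cycle walking. The sketch below makes no claim about statistical quality, and the name make_permutation, the key string, and the round count are all assumptions chosen for illustration.

```python
import hashlib

def make_permutation(n, key, rounds=4):
    """Return a function that maps each i in range(n) to a distinct value in range(n)."""
    # Split the value into two halves of half_bits bits; 2**(2*half_bits) >= n.
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1

    def round_fn(r, value):
        # Keyed pseudo-random function built from SHA-256, truncated to half_bits.
        digest = hashlib.sha256(f"{key}:{r}:{value}".encode()).digest()
        return int.from_bytes(digest, "big") & mask

    def encrypt(x):
        # A Feistel network is a bijection on 2*half_bits-bit integers.
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            left, right = right, left ^ round_fn(r, right)
        return (left << half_bits) | right

    def permute(i):
        # Cycle walking: re-encrypt until the value falls back inside range(n).
        x = encrypt(i)
        while x >= n:
            x = encrypt(x)
        return x

    return permute

permute = make_permutation(1000, key="demo")
sample = [permute(i) for i in range(150)]  # 150 distinct values below 1000
print(sample[:10])
```

Because the mapping is a bijection, feeding it 0, 1, 2, ... yields values that never repeat, and only a counter needs to be kept in memory, which is what makes the idea applicable to ranges far too large to shuffle.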

resampled time using scipy.signal.resample

I have a signal that is not sampled at equidistant times; for further processing it needs to be. I thought that scipy.signal.resample would do it, but I do not understand its behavior. The signal is in y and the corresponding times are in x. The resampled signal is expected in yy, with the corresponding times in xx. Does anyone know what I am doing wrong, or how to achieve what I need? This code does not work (xx is not time):

    import numpy as np
    from scipy import signal
    import matplotlib.pyplot as plt
    x = np.array([0,1,2,3,4,5,6,6.5,7,7.5,8,8.5,9])
    y = np.cos(-x**2/4.0)
    num=50
    z=signal.resample(y, num, x, axis=0, window=None)
    yy
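scipy.signal.resample assumes the input samples are equally spaced, so unevenly sampled data generally needs interpolation onto a uniform grid instead. The sketch below swaps in plain linear interpolation with np.interp, a different technique from FFT resampling, and reuses x, y, and num from the question. Whether linear interpolation is accurate enough depends on the signal.

```python
import numpy as np

# Unevenly spaced sample times and signal values, as in the question.
x = np.array([0, 1, 2, 3, 4, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9])
y = np.cos(-x**2 / 4.0)
num = 50

# Uniform time grid covering the same interval.
xx = np.linspace(x[0], x[-1], num)
# Linear interpolation of y onto the uniform grid (not FFT-based resampling).
yy = np.interp(xx, x, y)

print(xx[:5])
print(yy[:5])
```

Here xx really is time: it runs from the first to the last original sample in equal steps, and yy holds the interpolated signal at those times.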

How to specify a validation holdout set to caret

I really like using caret for at least the early stages of modeling, especially for its really easy-to-use resampling methods. However, I'm working on a model where the training set has a fair number of cases added via semi-supervised self-training, and my cross-validation results are really skewed because of it. My solution to this is to use a validation set to measure model performance, but I can't see a way to use a validation set directly within caret. Am I missing something, or is this just not supported? I know that I can write my own wrappers to do what caret would normally do for m, but it

Scipy interpolation how to resize/resample 3x3 matrix to 5x5?

EDIT: Paul has solved this one below. Thanks! I'm trying to resample (upscale) a 3x3 matrix to 5x5, filling in the intermediate points with either interpolate.interp2d or interpolate.RectBivariateSpline (or whatever works). If there's a simple, existing function to do this, I'd like to use it, but I haven't found it yet. For example, a function that would work like:

    # upscale 2x2 to 4x4
    matrixSmall = ([[-1,8],[3,5]])
    matrixBig = matrixSmall.resample(4,4,cubic)

So, if I start with a 3x3 matrix / array:

    0,-2,0
    -2,11,-2
    0,-2,0

I want to compute a new 5x5 matrix ("I" meaning interpolated value):

    0
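A minimal sketch of the upscaling with interpolate.RectBivariateSpline follows. It is not necessarily the solution Paul gave. The coordinate grids on [0, 1] are an arbitrary choice, and the spline degree is set to 2 because only three points are available per axis (the default cubic needs at least four).

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

small = np.array([[0, -2, 0],
                  [-2, 11, -2],
                  [0, -2, 0]], dtype=float)

# Arbitrary coordinates for the rows/columns of the small and big grids.
coords_small = np.linspace(0.0, 1.0, 3)
coords_big = np.linspace(0.0, 1.0, 5)

# Quadratic spline in both directions (3 points per axis rule out cubic).
spline = RectBivariateSpline(coords_small, coords_small, small, kx=2, ky=2)
big = spline(coords_big, coords_big)  # evaluated on the 5x5 grid

print(big.round(2))
```

Every second row and column of big coincides with the original 3x3 values; the entries in between are the interpolated ones.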
