numpy-ndarray

How does slice indexing work in a numpy array?

℡╲_俬逩灬. submitted on 2020-07-30 10:03:32
Question: Suppose we have an array a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]]). Now I have:

row_r1 = a[1, :]
row_r2 = a[1:2, :]
print(row_r1.shape)
print(row_r2.shape)

I don't understand why row_r1.shape is (4,) while row_r2.shape is (1, 4). Shouldn't both shapes equal (4,)?

Answer 1: I like to think of it this way. The first form, a[1, :], says "get me all values on row 1", returning array([5, 6, 7, 8]) with shape (4,): four values in a one-dimensional numpy array. The second form, a[1:2, :], slices rows 1 up to (but not including) 2, so the row axis is preserved and the result has shape (1, 4).
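The difference is easy to verify directly; a minimal sketch:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_r1 = a[1, :]    # integer index: the row axis is dropped
row_r2 = a[1:2, :]  # slice index: the row axis is kept (length 1)

print(row_r1.shape)  # (4,)
print(row_r2.shape)  # (1, 4)
```

The general rule: every axis indexed with a plain integer is removed from the result, while every axis indexed with a slice is kept, even if the slice selects only one element.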

How do I concatenate two one-dimensional arrays in NumPy?

爷,独闯天下 submitted on 2020-07-22 21:34:03
Question: I have two arrays A = [a1, ..., an] and B = [b1, ..., bn]. I want to get a new matrix C equal to [[a1, b1], [a2, b2], ..., [an, bn]]. How can I do it using numpy.concatenate?

Answer 1: How about this very simple and fast solution?

In [73]: a = np.array([0, 1, 2, 3, 4, 5])
In [74]: b = np.array([1, 2, 3, 4, 5, 6])
In [75]: ab = np.array([a, b])
In [76]: c = ab.T
In [77]: c
Out[77]:
array([[0, 1],
       [1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6]])

But, as Divakar pointed out, using np.column_stack((a, b)) gives the same result directly.
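A short sketch comparing the transpose trick with np.column_stack (and np.stack, which is equivalent here):

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([1, 2, 3, 4, 5, 6])

c1 = np.array([a, b]).T        # stack as rows, then transpose
c2 = np.column_stack((a, b))   # stack as columns directly
c3 = np.stack((a, b), axis=1)  # same result via np.stack

print(c2.shape)  # (6, 2)
```

All three produce the same (n, 2) matrix; column_stack states the intent most plainly, while the transpose trick avoids one function call.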

Converting numpy array into dask dataframe column?

独自空忆成欢 submitted on 2020-07-22 12:27:41
Question: I have a numpy array that I want to add as a column to an existing dask dataframe:

enc = LabelEncoder()
nparr = enc.fit_transform(X[['url']])

I have ddf of type dask dataframe, and I want to achieve ddf['nurl'] = nparr. Is there an elegant way to do this? "Python PANDAS: Converting from pandas/numpy to dask dataframe/array" does not solve my issue, as I want the numpy array added to an existing dask dataframe.

Answer 1: You can convert the numpy array to a dask Series object, then merge it into the dataframe. You will need to make sure its chunking lines up with the dataframe's partitions.

Sort Structured Numpy Array On Multiple Columns In Different Order

时光毁灭记忆、已成空白 submitted on 2020-07-18 22:26:28
Question: I have a structured numpy array:

dtype = [('price', float), ('counter', int)]
values = [(35, 1), (36, 2), (36, 3)]
a = np.array(values, dtype=dtype)

I want to sort by price, and then by counter when prices are equal:

a_sorted = np.sort(a, order=['price', 'counter'])[::-1]

I need price in descending order, and when prices are equal, counter in ASCENDING order. With the code above, both price and counter come out in descending order. What I get is:

a_sorted: [(36., 3), (36., 2), (35., 1)]
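np.sort with order= can only sort all keys in the same direction. For numeric fields, a common workaround is np.lexsort with the descending key negated; a sketch:

```python
import numpy as np

dtype = [('price', float), ('counter', int)]
values = [(35, 1), (36, 2), (36, 3)]
a = np.array(values, dtype=dtype)

# lexsort sorts by the LAST key first; negating price makes it
# descending while counter stays ascending.
order = np.lexsort((a['counter'], -a['price']))
a_sorted = a[order]

print(a_sorted)  # [(36., 2) (36., 3) (35., 1)]
```

This trick only works for keys that can be meaningfully negated (numbers); for string keys you would need a different approach, such as sorting twice with a stable sort.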

Shuffling and importing few rows of a saved numpy file

别等时光非礼了梦想. submitted on 2020-06-29 04:11:09
Question: I have 2 saved .npy files:

X_train: (18873, 224, 224, 3), 21.2 GB
Y_train: (18873,), 148 KB

X_train contains cat and dog images (cats in the first half, dogs in the second, unshuffled), mapped to Y_train as 0s and 1s, so Y_train is [1,1,1,1,1,1,.........,0,0,0,0,0,0]. I want to randomly import, say, 256 images (cats and dogs in roughly a 50/50 split) into X, along with their mapping in Y. Since the data is large, I cannot load X_train into my RAM. Thus I have tried (1st approach): import numpy
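One way to avoid loading the whole file is np.load with mmap_mode, which reads only the rows actually indexed. A sketch with small stand-in arrays (the shapes and file paths here are illustrative, not the original 21 GB data):

```python
import os
import tempfile
import numpy as np

tmpdir = tempfile.mkdtemp()
x_path = os.path.join(tmpdir, 'X_train.npy')
y_path = os.path.join(tmpdir, 'Y_train.npy')

# Small stand-ins for the real arrays (shapes are illustrative).
np.save(x_path, np.arange(100 * 8, dtype=np.float32).reshape(100, 8))
np.save(y_path, np.array([1] * 50 + [0] * 50))

# Memory-map X so only the rows actually indexed are read from disk.
X_mm = np.load(x_path, mmap_mode='r')
Y = np.load(y_path)

# Draw half the batch from each half of the file to keep classes ~50/50.
rng = np.random.default_rng(0)
half = len(X_mm) // 2
idx = np.concatenate([rng.choice(half, 8, replace=False),
                      rng.choice(half, 8, replace=False) + half])
idx.sort()  # sorted indices make the disk reads sequential

X_batch = np.asarray(X_mm[idx])  # fancy indexing copies these rows into RAM
Y_batch = Y[idx]
print(X_batch.shape, Y_batch.shape)  # (16, 8) (16,)
```

Only the 16 selected rows are materialized in memory; the memmap itself costs almost nothing regardless of file size.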

Hessian of Gaussian eigenvalues for 3D image with Python

自闭症网瘾萝莉.ら submitted on 2020-06-29 03:38:27
Question: I have a 3D image and I want to calculate the Hessian-of-Gaussian eigenvalues for it: the three eigenvalues of the Hessian approximation at each voxel. This feature seems to be very common in image processing. Is there an existing implementation (like scipy.ndimage.laplace for the Laplacian)? And is there one that parallelizes the calculation? I tried to do it manually with numpy operations, but it's not optimal because: I have to
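A compact sketch of the manual route, using scipy's Gaussian derivative filters for the second derivatives and the fact that np.linalg.eigvalsh vectorizes over leading dimensions (the function name is mine; scikit-image's hessian_matrix / hessian_matrix_eigvals cover similar ground):

```python
import numpy as np
from scipy import ndimage

def hessian_of_gaussian_eigvals(volume, sigma=1.0):
    """Eigenvalues of the Hessian of Gaussian at every voxel (a sketch)."""
    H = np.empty(volume.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            order = [0, 0, 0]
            order[i] += 1
            order[j] += 1
            # gaussian_filter with a derivative order differentiates
            # while smoothing, giving d^2/dx_i dx_j of the Gaussian-blurred volume.
            d = ndimage.gaussian_filter(volume, sigma=sigma, order=order)
            H[..., i, j] = d
            H[..., j, i] = d
    # eigvalsh broadcasts over the leading (voxel) dimensions.
    return np.linalg.eigvalsh(H)

vol = np.random.default_rng(0).random((8, 8, 8))
eigs = hessian_of_gaussian_eigvals(vol)
print(eigs.shape)  # (8, 8, 8, 3)
```

Note the intermediate H costs 9x the volume's memory; for very large volumes the filtering could be done block-wise, or parallelized with something like dask.array.map_overlap.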

How to find all neighbour values near the edge in array?

别等时光非礼了梦想. submitted on 2020-06-27 04:02:08
Question: I have an array consisting of 0s and 1s. First, I need to find all clusters of neighbouring 1s; I managed to do this (the solution is in the link below). Second, I need to select the clusters in which any element lies near the top boundary. I can find neighbours with code from here, but I need to keep only the clusters that are in contact with the top boundary. Here is an example with a 2D array:

Input: array([[0, 0, 0, 0, 1, 0, 0, 0, 1, 0], [0, 0, 0, 1, 1, 0, 0, 0, 1, 0], [0, 0, 0, 0, 1, 1, 0, 0, 0,
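A sketch of one common approach using scipy.ndimage.label: label the clusters, collect the labels present in the top row, and keep only those (the example array below is mine, since the original is truncated):

```python
import numpy as np
from scipy import ndimage

a = np.array([[0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 0, 0],
              [1, 1, 0, 1, 0]])

labels, n = ndimage.label(a)       # 4-connectivity by default
top_labels = np.unique(labels[0])  # labels appearing in the top row
top_labels = top_labels[top_labels != 0]
result = np.where(np.isin(labels, top_labels), a, 0)

print(result)
```

Here the two clusters touching row 0 survive, while the bottom-left cluster and the isolated 1 at (3, 3) are zeroed out. Passing a custom structure to ndimage.label would switch to 8-connectivity if diagonal neighbours should count.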

Error with numpy array calculations using int dtype (it fails to cast dtype to 64 bit automatically when needed)

烈酒焚心 submitted on 2020-06-25 03:43:06
Question: I'm encountering incorrect numpy calculations when the inputs are a numpy array with a 32-bit integer data type but the outputs include larger numbers that require a 64-bit representation. Here's a minimal working example:

arr = np.ones(5, dtype=int) * (2**24 + 300)  # arr.dtype defaults to 'int32'
# Following a comment from @hpaulj I changed the first line, which was originally:
# arr = np.zeros(5, dtype=int)
# arr[:] = 2**24 + 300
single_value_calc = 2**8 * (2**24 + 300)
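A sketch of the failure mode, forcing int32 explicitly so it reproduces on every platform (the int default is int32 on Windows but int64 on most Linux builds, which is why the bug is platform-dependent):

```python
import numpy as np

arr = np.ones(5, dtype=np.int32) * np.int32(2**24 + 300)

wrong = arr * 2**8                   # stays int32 and silently wraps around
right = arr.astype(np.int64) * 2**8  # cast first, then multiply

print(wrong[0])  # 76800 (the true result modulo 2**32)
print(right[0])  # 4295044096 == (2**24 + 300) * 2**8
```

NumPy does not promote integer dtypes just because a result overflows; the fix is to cast to int64 (or use dtype=np.int64 at creation) before the multiplication.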

What's the difference between `arr[tuple(seq)]` and `arr[seq]`? (Relating to "Using a non-tuple sequence for multidimensional indexing is deprecated")

大城市里の小女人 submitted on 2020-06-16 06:46:51
Question: I am using an ndarray to slice another ndarray. Normally I use arr[ind_arr]. numpy seems not to like this and raises a FutureWarning: "Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]". What's the difference between arr[tuple(seq)] and arr[seq]? Other questions on Stack Overflow run into this warning in scipy and pandas, and most people attribute it to the particular version of those packages. I am running
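A sketch of the distinction: a tuple means "one index per axis", while a list triggers fancy indexing along the first axis.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# A tuple is ordinary multidimensional indexing: one entry per axis.
seq = (1, slice(0, 2))
print(arr[seq])     # same as arr[1, 0:2] -> [4 5]

# A list of integers is fancy indexing along the first axis instead.
print(arr[[1, 2]])  # rows 1 and 2, shape (2, 4)
```

A list that mixes integers and slices, such as [1, slice(0, 2)], is ambiguous between these two meanings; numpy used to guess "treat it like a tuple", issued the FutureWarning about that guess, and recent versions reject it outright. Wrapping the sequence in tuple() states the multidimensional interpretation explicitly.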

Efficiently Calculating a Euclidean Distance Matrix in Numpy?

寵の児 submitted on 2020-05-30 08:13:01
Question: I have a large array (~20k entries) of two-dimensional data, and I want to calculate the pairwise Euclidean distance between all entries. I need the output in standard square form. Multiple solutions for this problem have been proposed, but none of them seem to work efficiently for large arrays. The method using complex transposing fails for large arrays. scipy's pdist seems to be the most efficient numpy-based method; however, using squareform on the result to obtain a square matrix makes
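A vectorized numpy sketch based on the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2a·b, which produces the square matrix directly and avoids the n×n×d intermediate that broadcasting-based solutions allocate (the function name is mine):

```python
import numpy as np

def euclidean_dist_matrix(X):
    """Square (n, n) matrix of pairwise Euclidean distances."""
    sq = np.einsum('ij,ij->i', X, X)  # row-wise squared norms
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)       # clip tiny negatives from round-off
    return np.sqrt(d2)

X = np.random.default_rng(0).random((5, 2))
D = euclidean_dist_matrix(X)
print(D.shape)  # (5, 5)
```

For n = 20k this still needs an n×n float64 output (~3.2 GB), which is inherent to the square-form requirement; scipy.spatial.distance.cdist(X, X) computes the same matrix in C and is a good alternative.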