numpy-ndarray

How does slice indexing work in a numpy array?

℡╲_俬逩灬. submitted on 2020-07-30 10:03:32
Question: Suppose we have an array a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]]). Now I have:

row_r1 = a[1, :]
row_r2 = a[1:2, :]
print(row_r1.shape)
print(row_r2.shape)

I don't understand why row_r1.shape is (4,) while row_r2.shape is (1, 4). Shouldn't both shapes equal (4,)?

Answer 1: I like to think of it this way. The first form, a[1, :], says "get me all values on row 1", returning array([5, 6, 7, 8]) with shape (4,): four values in a one-dimensional numpy array. The second form, a[1:2, :], slices rows 1 up to (but not including) 2, so the row axis is preserved and the result has shape (1, 4).
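The difference is easy to verify directly; a minimal sketch:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_r1 = a[1, :]    # integer index: the row axis is dropped
row_r2 = a[1:2, :]  # slice index: the row axis is kept (length 1)

print(row_r1.shape)  # (4,)
print(row_r2.shape)  # (1, 4)
```

The general rule: every axis indexed with a plain integer is removed from the result, while every axis indexed with a slice is kept, even if the slice selects only one element.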

How do I concatenate two one-dimensional arrays in NumPy?

爷,独闯天下 submitted on 2020-07-22 21:34:03
Question: I have two arrays A = [a1, ..., an] and B = [b1, ..., bn]. I want to get a new matrix C equal to [[a1, b1], [a2, b2], ..., [an, bn]]. How can I do it using numpy.concatenate?

Answer 1: How about this very simple and fast solution?

In [73]: a = np.array([0, 1, 2, 3, 4, 5])
In [74]: b = np.array([1, 2, 3, 4, 5, 6])
In [75]: ab = np.array([a, b])
In [76]: c = ab.T
In [77]: c
Out[77]:
array([[0, 1],
       [1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6]])

But, as Divakar pointed out, using np.column_stack((a, b)) gives the same result directly.
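A short sketch comparing the transpose trick with np.column_stack (and np.stack, which is equivalent here):

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([1, 2, 3, 4, 5, 6])

c1 = np.array([a, b]).T        # stack as rows, then transpose
c2 = np.column_stack((a, b))   # stack as columns directly
c3 = np.stack((a, b), axis=1)  # same result via np.stack

print(c2.shape)  # (6, 2)
```

All three produce the same (n, 2) matrix; column_stack states the intent most plainly, while the transpose trick avoids one function call.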

Converting numpy array into dask dataframe column?

独自空忆成欢 submitted on 2020-07-22 12:27:41
Question: I have a numpy array that I want to add as a column to an existing dask dataframe:

enc = LabelEncoder()
nparr = enc.fit_transform(X[['url']])

I have ddf of type dask dataframe, and I want to achieve ddf['nurl'] = nparr. Is there an elegant way to do this? "Python PANDAS: Converting from pandas/numpy to dask dataframe/array" does not solve my issue, as I want the numpy array added to an existing dask dataframe.

Answer 1: You can convert the numpy array to a dask Series object, then merge it into the dataframe. You will need to make sure its chunking lines up with the dataframe's partitions.

Sort Structured Numpy Array On Multiple Columns In Different Order

时光毁灭记忆、已成空白 submitted on 2020-07-18 22:26:28
Question: I have a structured numpy array:

dtype = [('price', float), ('counter', int)]
values = [(35, 1), (36, 2), (36, 3)]
a = np.array(values, dtype=dtype)

I want to sort by price, and then by counter when prices are equal:

a_sorted = np.sort(a, order=['price', 'counter'])[::-1]

I need price in descending order, and when prices are equal, counter in ASCENDING order. With the code above, both price and counter come out in descending order. What I get is:

a_sorted: [(36., 3), (36., 2), (35., 1)]
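np.sort with order= can only sort all keys in the same direction. For numeric fields, a common workaround is np.lexsort with the descending key negated; a sketch:

```python
import numpy as np

dtype = [('price', float), ('counter', int)]
values = [(35, 1), (36, 2), (36, 3)]
a = np.array(values, dtype=dtype)

# lexsort sorts by the LAST key first; negating price makes it
# descending while counter stays ascending.
order = np.lexsort((a['counter'], -a['price']))
a_sorted = a[order]

print(a_sorted)  # [(36., 2) (36., 3) (35., 1)]
```

This trick only works for keys that can be meaningfully negated (numbers); for string keys you would need a different approach, such as sorting twice with a stable sort.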

Shuffling and importing few rows of a saved numpy file

别等时光非礼了梦想. submitted on 2020-06-29 04:11:09
Question: I have 2 saved .npy files:

X_train: (18873, 224, 224, 3), 21.2 GB
Y_train: (18873,), 148 KB

X_train contains cat and dog images (cats in the first half, dogs in the second, unshuffled), mapped to Y_train as 0s and 1s, so Y_train is [1,1,1,1,1,1,.........,0,0,0,0,0,0]. I want to randomly import, say, 256 images (cats and dogs in roughly a 50/50 split) into X, along with their mapping in Y. Since the data is large, I cannot load X_train into my RAM. Thus I have tried (1st approach): import numpy
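One way to avoid loading the whole file is np.load with mmap_mode, which reads only the rows actually indexed. A sketch with small stand-in arrays (the shapes and file paths here are illustrative, not the original 21 GB data):

```python
import os
import tempfile
import numpy as np

tmpdir = tempfile.mkdtemp()
x_path = os.path.join(tmpdir, 'X_train.npy')
y_path = os.path.join(tmpdir, 'Y_train.npy')

# Small stand-ins for the real arrays (shapes are illustrative).
np.save(x_path, np.arange(100 * 8, dtype=np.float32).reshape(100, 8))
np.save(y_path, np.array([1] * 50 + [0] * 50))

# Memory-map X so only the rows actually indexed are read from disk.
X_mm = np.load(x_path, mmap_mode='r')
Y = np.load(y_path)

# Draw half the batch from each half of the file to keep classes ~50/50.
rng = np.random.default_rng(0)
half = len(X_mm) // 2
idx = np.concatenate([rng.choice(half, 8, replace=False),
                      rng.choice(half, 8, replace=False) + half])
idx.sort()  # sorted indices make the disk reads sequential

X_batch = np.asarray(X_mm[idx])  # fancy indexing copies these rows into RAM
Y_batch = Y[idx]
print(X_batch.shape, Y_batch.shape)  # (16, 8) (16,)
```

Only the 16 selected rows are materialized in memory; the memmap itself costs almost nothing regardless of file size.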

Hessian of Gaussian eigenvalues for 3D image with Python

自闭症网瘾萝莉.ら submitted on 2020-06-29 03:38:27
Question: I have a 3D image and I want to calculate the Hessian-of-Gaussian eigenvalues for it: the three eigenvalues of the Hessian approximation at each voxel. This feature seems to be very common in image processing. Is there an existing implementation (like scipy.ndimage.laplace for the Laplacian)? And is there one that parallelizes the calculation? I tried to do it manually with numpy operations, but it's not optimal because: I have to
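A compact sketch of the manual route, using scipy's Gaussian derivative filters for the second derivatives and the fact that np.linalg.eigvalsh vectorizes over leading dimensions (the function name is mine; scikit-image's hessian_matrix / hessian_matrix_eigvals cover similar ground):

```python
import numpy as np
from scipy import ndimage

def hessian_of_gaussian_eigvals(volume, sigma=1.0):
    """Eigenvalues of the Hessian of Gaussian at every voxel (a sketch)."""
    H = np.empty(volume.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            order = [0, 0, 0]
            order[i] += 1
            order[j] += 1
            # gaussian_filter with a derivative order differentiates
            # while smoothing, giving d^2/dx_i dx_j of the Gaussian-blurred volume.
            d = ndimage.gaussian_filter(volume, sigma=sigma, order=order)
            H[..., i, j] = d
            H[..., j, i] = d
    # eigvalsh broadcasts over the leading (voxel) dimensions.
    return np.linalg.eigvalsh(H)

vol = np.random.default_rng(0).random((8, 8, 8))
eigs = hessian_of_gaussian_eigvals(vol)
print(eigs.shape)  # (8, 8, 8, 3)
```

Note the intermediate H costs 9x the volume's memory; for very large volumes the filtering could be done block-wise, or parallelized with something like dask.array.map_overlap.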

How to find all neighbour values near the edge in array?

别等时光非礼了梦想. submitted on 2020-06-27 04:02:08
Question: I have an array consisting of 0s and 1s. First, I need to find all clusters of neighbouring 1s; I managed to do this (the solution is in the link below). Second, I need to select the clusters in which any element lies near the top boundary. I can find neighbours with code from here, but I need to keep only the clusters that are in contact with the top boundary. Here is an example with a 2D array:

Input: array([[0, 0, 0, 0, 1, 0, 0, 0, 1, 0], [0, 0, 0, 1, 1, 0, 0, 0, 1, 0], [0, 0, 0, 0, 1, 1, 0, 0, 0,
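A sketch of one common approach using scipy.ndimage.label: label the clusters, collect the labels present in the top row, and keep only those (the example array below is mine, since the original is truncated):

```python
import numpy as np
from scipy import ndimage

a = np.array([[0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 0, 0],
              [1, 1, 0, 1, 0]])

labels, n = ndimage.label(a)       # 4-connectivity by default
top_labels = np.unique(labels[0])  # labels appearing in the top row
top_labels = top_labels[top_labels != 0]
result = np.where(np.isin(labels, top_labels), a, 0)

print(result)
```

Here the two clusters touching row 0 survive, while the bottom-left cluster and the isolated 1 at (3, 3) are zeroed out. Passing a custom structure to ndimage.label would switch to 8-connectivity if diagonal neighbours should count.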

Error with numpy array calculations using int dtype (it fails to cast dtype to 64 bit automatically when needed)

烈酒焚心 submitted on 2020-06-25 03:43:06
Question: I'm encountering incorrect numpy calculations when the inputs are a numpy array with a 32-bit integer data type but the outputs include larger numbers that require a 64-bit representation. Here's a minimal working example:

arr = np.ones(5, dtype=int) * (2**24 + 300)  # arr.dtype defaults to 'int32'
# Following a comment from @hpaulj I changed the first line, which was originally:
# arr = np.zeros(5, dtype=int)
# arr[:] = 2**24 + 300
single_value_calc = 2**8 * (2**24 + 300)
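A sketch of the failure mode, forcing int32 explicitly so it reproduces on every platform (the int default is int32 on Windows but int64 on most Linux builds, which is why the bug is platform-dependent):

```python
import numpy as np

arr = np.ones(5, dtype=np.int32) * np.int32(2**24 + 300)

wrong = arr * 2**8                   # stays int32 and silently wraps around
right = arr.astype(np.int64) * 2**8  # cast first, then multiply

print(wrong[0])  # 76800 (the true result modulo 2**32)
print(right[0])  # 4295044096 == (2**24 + 300) * 2**8
```

NumPy does not promote integer dtypes just because a result overflows; the fix is to cast to int64 (or use dtype=np.int64 at creation) before the multiplication.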

What's the difference between `arr[tuple(seq)]` and `arr[seq]`? (Relating to "Using a non-tuple sequence for multidimensional indexing is deprecated")

大城市里の小女人 submitted on 2020-06-16 06:46:51
Question: I am using an ndarray to slice another ndarray. Normally I use arr[ind_arr]. numpy seems not to like this and raises a FutureWarning: "Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]". What's the difference between arr[tuple(seq)] and arr[seq]? Other questions on Stack Overflow run into this warning in scipy and pandas, and most people attribute it to the particular version of those packages. I am running
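A sketch of the distinction: a tuple means "one index per axis", while a list triggers fancy indexing along the first axis.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# A tuple is ordinary multidimensional indexing: one entry per axis.
seq = (1, slice(0, 2))
print(arr[seq])     # same as arr[1, 0:2] -> [4 5]

# A list of integers is fancy indexing along the first axis instead.
print(arr[[1, 2]])  # rows 1 and 2, shape (2, 4)
```

A list that mixes integers and slices, such as [1, slice(0, 2)], is ambiguous between these two meanings; numpy used to guess "treat it like a tuple", issued the FutureWarning about that guess, and recent versions reject it outright. Wrapping the sequence in tuple() states the multidimensional interpretation explicitly.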

Efficiently Calculating a Euclidean Distance Matrix in Numpy?

寵の児 submitted on 2020-05-30 08:13:01
Question: I have a large array (~20k entries) of two-dimensional data, and I want to calculate the pairwise Euclidean distance between all entries. I need the output in standard square form. Multiple solutions for this problem have been proposed, but none of them seem to work efficiently for large arrays. The method using complex transposing fails for large arrays. scipy's pdist seems to be the most efficient numpy-based method; however, using squareform on the result to obtain a square matrix makes
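A vectorized numpy sketch based on the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2a·b, which produces the square matrix directly and avoids the n×n×d intermediate that broadcasting-based solutions allocate (the function name is mine):

```python
import numpy as np

def euclidean_dist_matrix(X):
    """Square (n, n) matrix of pairwise Euclidean distances."""
    sq = np.einsum('ij,ij->i', X, X)  # row-wise squared norms
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)       # clip tiny negatives from round-off
    return np.sqrt(d2)

X = np.random.default_rng(0).random((5, 2))
D = euclidean_dist_matrix(X)
print(D.shape)  # (5, 5)
```

For n = 20k this still needs an n×n float64 output (~3.2 GB), which is inherent to the square-form requirement; scipy.spatial.distance.cdist(X, X) computes the same matrix in C and is a good alternative.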