numpy

How to perform a merge of (too) large dataframes?

痴心易碎 posted on 2021-01-29 06:43:41
Question: I'm trying to merge a couple of dataframes from the HomeCredit Kaggle competition according to the data schema. I did the following:

    train = pd.read_csv('~/Documents/HomeCredit/application_train.csv')
    bureau = pd.read_csv('~/Documents/HomeCredit/bureau.csv')
    bureau_balance = pd.read_csv('~/Documents/HomeCredit/bureau_balance.csv')
    train = train.merge(bureau,how='outer',left_on=['SK_ID_CURR'],right_on=['SK_ID_CURR'])
    train = train.merge(bureau_balance,how='inner',left_on=['SK_ID_BUREAU'],right_on=['SK_ID
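One common way to keep such a merge from blowing up is to aggregate each child table down to one row per key before joining, so the row count never multiplies. The following is only a minimal sketch under assumptions: the three small frames are toy stand-ins for the real CSV files, and the aggregated column names (BB_MONTHS, N_CREDITS, CREDIT_SUM) are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for the real CSV files (values are hypothetical).
train = pd.DataFrame({"SK_ID_CURR": [1, 2, 3], "TARGET": [0, 1, 0]})
bureau = pd.DataFrame({
    "SK_ID_CURR": [1, 1, 2],
    "SK_ID_BUREAU": [10, 11, 12],
    "AMT_CREDIT_SUM": [1000.0, 500.0, 250.0],
})
bureau_balance = pd.DataFrame({
    "SK_ID_BUREAU": [10, 10, 11, 12, 12],
    "MONTHS_BALANCE": [0, -1, 0, 0, -1],
})

# 1. Collapse bureau_balance to one row per SK_ID_BUREAU.
bb_agg = (
    bureau_balance.groupby("SK_ID_BUREAU")
    .agg(BB_MONTHS=("MONTHS_BALANCE", "size"))
    .reset_index()
)

# 2. Attach it to bureau, then collapse to one row per SK_ID_CURR.
bureau_agg = (
    bureau.merge(bb_agg, on="SK_ID_BUREAU", how="left")
    .groupby("SK_ID_CURR")
    .agg(
        N_CREDITS=("SK_ID_BUREAU", "count"),
        CREDIT_SUM=("AMT_CREDIT_SUM", "sum"),
        BB_MONTHS=("BB_MONTHS", "sum"),
    )
    .reset_index()
)

# 3. A left merge keeps exactly one row per applicant in train.
merged = train.merge(bureau_agg, on="SK_ID_CURR", how="left")
print(merged)
```

Applicants without any bureau record simply get NaN in the aggregated columns, and the result has the same number of rows as train.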

How to get a sigmoidal CDF curve using scipy.stats.norm.cdf and matplotlib?

孤者浪人 posted on 2021-01-29 06:32:28
Question: I am trying to plot the S-shaped cumulative distribution function (CDF) curve of a normal distribution. However, I ended up with a uniform distribution. What am I doing wrong?

Test script:

    import numpy as np
    from numpy.random import default_rng
    from scipy.stats import norm
    import matplotlib.pyplot as plt
    siz = 1000
    rg = default_rng( 12345 )
    a = rg.random(size=siz)
    rg = default_rng( 12345 )
    b = norm.rvs(size=siz, random_state=rg)
    c = norm.cdf(b)
    print( 'a = ', a)
    print( 'b = ', b)
    print( 'c = ',
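A likely explanation is that applying the CDF to normally distributed samples yields values that are roughly uniform on [0, 1], so a histogram of those values looks flat. The S-shape only appears when the CDF is plotted against ordered x values. A minimal sketch, assuming an evaluation grid from -4 to 4 (my choice, not from the question) and a hypothetical helper plot_cdf that is defined but not called here:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(12345)
b = norm.rvs(size=1000, random_state=rng)

# CDF values of normal samples are approximately uniform on [0, 1].
c = norm.cdf(b)

# For the S-curve, evaluate the CDF on an ordered grid instead.
x = np.linspace(-4, 4, 201)
y = norm.cdf(x)


def plot_cdf(x, y):
    """Hypothetical helper: draw y against x with matplotlib."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting

    plt.plot(x, y)
    plt.xlabel("x")
    plt.ylabel("norm.cdf(x)")
    plt.show()
```

Calling plot_cdf(x, y) draws the sigmoid. Plotting np.sort(b) against norm.cdf(np.sort(b)) should give the same shape from the samples themselves.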

Convert list of strings to numpy array of floats

随声附和 posted on 2021-01-29 06:24:19
Question: Assume I have a list of strings and I want to convert it to a numpy array. For example, I have

    A=['[1 2 3 4 5 6 7]','[8 9 10 11 12 13 14]']
    print(A)
    ['[1 2 3 4 5 6 7]', '[8 9 10 11 12 13 14]']

I want my output to be a 2-by-7 matrix like the following:

    [1 2 3 4 5 6 7;8 9 10 11 12 13 14]

What I have tried so far is the following:

    m=len(A)
    M=[]
    for ii in range(m):
        temp=A[ii]
        temp=temp.strip('[')
        temp=temp.strip(']')
        M.append(temp)
    print(np.asarray(M))

However, my output is the following:
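The attempt above strips the brackets but never splits each string into separate numbers, so numpy still sees one string per row. A minimal sketch of one possible fix, assuming every string holds the same count of whitespace-separated numbers:

```python
import numpy as np

A = ['[1 2 3 4 5 6 7]', '[8 9 10 11 12 13 14]']

# Strip the surrounding brackets, then split each string on whitespace.
rows = [s.strip('[]').split() for s in A]

# dtype=float makes numpy parse the string tokens as numbers.
M = np.array(rows, dtype=float)
print(M.shape)
print(M)
```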

Error when checking target: expected dense_1 to have shape (1,) but got array with shape (256,)

不羁岁月 posted on 2021-01-29 06:20:31
Question: I am trying to learn TensorFlow, and I was following a demo tutorial (https://www.tensorflow.org/tutorials/keras/basic_text_classification). The error report tells me "Error when checking target: expected dense_1 to have shape (1,) but got array with shape (256,)". Can someone explain to me why this won't work?

    train_data = keras.preprocessing.sequence.pad_sequences(train_data, value=word_index["<PAD>"], padding='post', maxlen=256) #max length
    test_data = keras.preprocessing.sequence.pad

Constrain numpy to automatically convert integers to floating-point numbers (python 3.7)

感情迁移 posted on 2021-01-29 06:17:29
Question: I have just made the following mistake:

    a = np.array([0,3,2, 1])
    a[0] = .001

I was expecting 0 to be replaced by .001 (and the dtype of my numpy array to automatically switch from int to float). However, print (a) returns:

    array([0, 3, 2, 1])

Can somebody explain why numpy is doing that? I am confused because multiplying my array of integers by a floating-point number will automatically change the dtype to float:

    b = a*.1
    print (b)
    array([0. , 0.3, 0.2, 0.1])

Is there a way to constrain numpy to
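For context, a numpy array has a fixed dtype, so assigning into an element casts the value to that dtype, whereas a*.1 builds a new array with a promoted dtype. A minimal sketch of the behaviour and of one workaround, which is to make the array float up front (the name af is only illustrative):

```python
import numpy as np

a = np.array([0, 3, 2, 1])   # integer dtype, inferred from the values
a[0] = 0.001                 # value is cast to the array's dtype, so it becomes 0

b = a * 0.1                  # arithmetic returns a new float64 array

# Workaround: create or convert the array as float before assigning.
af = a.astype(float)         # or np.array([0, 3, 2, 1], dtype=float)
af[0] = 0.001

print(a.dtype, b.dtype, af.dtype)
print(af)
```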

generate combinations of arrays

落花浮王杯 posted on 2021-01-29 06:08:56
Question: I have 9 arrays that I want to manipulate to find all possible combinations, so that the name of the resulting array tells me which arrays have been combined. For example:

    a1_a2 = array1 - array2
    a1_a3 = array1 - array3
    a1_a4 = array1 - array4
    . . .
    a9_a6 = array9 - array6
    a9_a7 = array9 - array7
    a9_a8 = array9 - array8

Obviously I could hardcode it, but how could I do it in a loop? I thought of writing a function for it, something like:

    def combineArrays(array1, array2): result_name = name
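Rather than generating variable names dynamically, one option is to store the results in a dictionary keyed by the desired name. Since the example lists both a1_a2 and a9_a8, ordered pairs seem to be wanted, so this sketch uses itertools.permutations. The nine arrays here are placeholder data.

```python
from itertools import permutations

import numpy as np

# Placeholder data: nine small arrays keyed by their index 1..9.
arrays = {i: np.full(3, float(i)) for i in range(1, 10)}

# One entry per ordered pair (i, j) with i != j, e.g. "a1_a2" = array1 - array2.
results = {
    f"a{i}_a{j}": arrays[i] - arrays[j]
    for i, j in permutations(arrays, 2)
}

print(len(results))
print(results["a9_a8"])
```

If each unordered pair should appear only once, itertools.combinations can be swapped in for permutations.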

Numpy Inheritance; add a method to Numpy Array

两盒软妹~` posted on 2021-01-29 06:04:06
Question: Let's say we have a 2D array, image, e.g. 20x20. I would like to add a method called 'imshow' to this object, such that whenever I call image.imshow(**kwargs), the method imshow will call matplotlib.pyplot.imshow. What is the best way to do this? I was thinking of writing a class that inherits from numpy.ndarray and adding a method 'imshow'.

Answer 1: Just found the answer (thanks to "How can a class that inherits from a NumPy array change its own values?")!

    class array(np.ndarray): def
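Because the answer above is cut off, here is a separate minimal sketch of the view-casting pattern for subclassing ndarray. It is not the original answer's code. The class name ImageArray is hypothetical, and matplotlib is imported lazily inside the method.

```python
import numpy as np


class ImageArray(np.ndarray):
    """ndarray subclass with an extra imshow method (illustrative name)."""

    def __new__(cls, input_array):
        # View-cast existing data to the subclass without copying it.
        return np.asarray(input_array).view(cls)

    def imshow(self, **kwargs):
        import matplotlib.pyplot as plt  # only needed when actually plotting

        return plt.imshow(np.asarray(self), **kwargs)


image = ImageArray(np.zeros((20, 20)))
print(type(image), image.shape)
```

With matplotlib installed, image.imshow(cmap="gray") followed by plt.show() would display the array. Results of arithmetic such as image + 1 remain ImageArray instances.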

How to efficiently split scipy sparse and numpy arrays into smaller N unequal chunks?

隐身守侯 posted on 2021-01-29 06:01:16
Question: After checking the documentation and this question, I tried to split a numpy array and a scipy sparse matrix as follows:

    >>> print(X.shape)
    (2399, 39999)
    >>> print(type(X))
    <class 'scipy.sparse.csr.csr_matrix'>
    >>> print(X.toarray())
    [[0 0 0 ..., 0 0 0]
     [0 0 0 ..., 0 0 0]
     [0 0 0 ..., 0 0 0]
     ...,
     [0 0 0 ..., 0 0 0]
     [0 0 0 ..., 0 0 0]
     [0 0 0 ..., 0 0 0]]

Then:

    new_array = np.split(X,3)

Out:

    ValueError: array split does not result in an equal division

Then I tried:

    new_array = np.hsplit(X,3)
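np.split requires an equal division, and 2399 rows do not divide evenly by 3. np.array_split does allow unequal chunks. For a CSR matrix, one approach is to compute the row groups with np.array_split and then slice the sparse matrix by rows, which avoids converting it to a dense array. A minimal sketch, assuming a small random matrix as a stand-in for X:

```python
import numpy as np
from scipy import sparse

# Stand-in for X: same row count, far fewer columns (hypothetical data).
X = sparse.random(2399, 50, density=0.01, format="csr", random_state=0)

# array_split tolerates chunk sizes that differ by one.
row_groups = np.array_split(np.arange(X.shape[0]), 3)

# Each group is a contiguous run of row indices, so slice the CSR matrix by rows.
chunks = [X[idx[0]:idx[-1] + 1] for idx in row_groups]

print([chunk.shape for chunk in chunks])
```

The chunks stay sparse. For a dense ndarray, np.array_split(arr, 3) can be applied directly.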

importing numpy package in Spyder, Python

醉酒当歌 posted on 2021-01-29 05:52:35
Question: I'm starting to learn Python. The first thing I did was set up Python, in this case on Ubuntu 16.04 LTS, since Python is already installed on this system. As a first test, I tried to run a simple program with the numpy library, but the program does not run; it displays the following error:

    'import sitecustomize' failed; use -v for traceback <type 'numpy.ndarray'>

It seems to me that Spyder failed to import the numpy library, although I've already called it from the terminal and it loaded, that is, it's

importing numpy package in Spyder, Python

删除回忆录丶 posted on 2021-01-29 05:48:38
Question: I'm starting to learn Python. The first thing I did was set up Python, in this case on Ubuntu 16.04 LTS, since Python is already installed on this system. As a first test, I tried to run a simple program with the numpy library, but the program does not run; it displays the following error:

    'import sitecustomize' failed; use -v for traceback <type 'numpy.ndarray'>

It seems to me that Spyder failed to import the numpy library, although I've already called it from the terminal and it loaded, that is, it's