numpy | 易学教程

Matplotlib: align origin of right axis with specific left axis value

Question: When plotting several y-axes in Matplotlib, is there a way to specify how to align the origin (and/or some ytick labels) of the right axis with a specific value of the left axis? Here is my problem: I would like to plot two sets of data as well as their difference (basically, I am trying to reproduce this kind of graph). I can reproduce it, but I have to manually adjust the ylim of the right axis so that the origin is aligned with the value I want from the left axis. I have put an example below
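
The manual ylim adjustment described above can be computed instead of guessed. The following is a minimal sketch, not the asker's code: `aligned_ylim` and `demo` are hypothetical helper names, and the data in `demo` is made up for illustration. It assumes the left axis limits are already final, because changing them afterwards breaks the alignment.

```python
def aligned_ylim(left_lim, left_value, right_min, right_max):
    """Return (bottom, top) for the right axis so that its zero sits at the
    same height as `left_value` on the left axis."""
    left_bottom, left_top = left_lim
    # Fractional height of the chosen left value inside the left axis.
    frac = (left_value - left_bottom) / (left_top - left_bottom)
    if not 0.0 < frac < 1.0:
        raise ValueError("left_value must lie strictly inside left_lim")
    # Smallest span that keeps zero at `frac` and still shows all right data.
    span = max(-right_min / frac, right_max / (1.0 - frac))
    if span <= 0.0:  # right data is all zeros; any positive span works
        span = 1.0
    return -frac * span, (1.0 - frac) * span


def demo():
    """Illustrative use with Matplotlib (made-up data)."""
    import matplotlib.pyplot as plt

    x = [0, 1, 2, 3]
    left = [10, 12, 11, 14]
    right = [-1, 1, 0, 3]
    fig, ax_left = plt.subplots()
    ax_left.plot(x, left)
    ax_right = ax_left.twinx()
    ax_right.plot(x, right, color="tab:red")
    # Put the right axis origin level with the value 11 on the left axis.
    ax_right.set_ylim(*aligned_ylim(ax_left.get_ylim(), 11, min(right), max(right)))
    return fig
```

The helper only does arithmetic on the limits, so it can be called again whenever the left limits change.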

Sort matrix based on its diagonal entries

Question: First of all, I would like to point out that my question is different from this one: Sort a numpy matrix based on its diagonal. The question is as follows. Suppose I have a numpy matrix

A =
5 7 8
7 2 9
8 9 3

I would like to sort the matrix based on its diagonal and then rearrange the matrix elements accordingly, so that sorted_A is now:

2 9 7
9 3 8
7 8 5

Note that: (1) the diagonal is sorted; (2) the other (non-diagonal) elements are rearranged to follow it. How? Because diag(A) = [5, 2, 3] and diag(sorted_A) =
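
One way to get the rearrangement shown in the question is to apply the permutation that sorts the diagonal to both the rows and the columns. This is a minimal sketch under that reading of the problem; the function name `sort_by_diagonal` is hypothetical.

```python
import numpy as np


def sort_by_diagonal(matrix):
    """Permute rows and columns together so the diagonal ends up ascending."""
    matrix = np.asarray(matrix)
    order = np.argsort(np.diag(matrix))  # permutation that sorts the diagonal
    # np.ix_ builds an open mesh, so the same order is applied to both axes.
    return matrix[np.ix_(order, order)]


A = np.array([[5, 7, 8],
              [7, 2, 9],
              [8, 9, 3]])
sorted_A = sort_by_diagonal(A)
```

Applying the same permutation to both axes keeps every off-diagonal entry paired with the same two diagonal entries it started with, which is why a symmetric input stays symmetric.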

Bizarre behaviour of pandas Series.value_counts()

Question: I have a Pandas Series with numerical data and I want to find its unique values together with their frequencies. I use the standard procedure:

    # Given that my_data is a column of a pd.DataFrame df
    unique = df[my_data].value_counts()
    print(unique)

And here are the results that I get:

    # -------------------OUTPUT
    -0.010000    46483
    -0.010000    16895
    -0.027497    12215
    -0.294492    11915
     0.027497    11397

What I don't get is why I have the "same value" (-0.01) occurring twice. Is that an internal
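
One common cause of this kind of output (an assumption here, since the excerpt does not show the underlying data) is that the two entries are different floats that only look identical at the default display precision of six decimals. The sketch below uses made-up values to reproduce the effect and shows that rounding before counting merges them.

```python
import pandas as pd

# Made-up data: the third value differs from -0.01 only in the 9th decimal.
s = pd.Series([-0.01, -0.01, -0.010000001, 0.027497])

raw_counts = s.value_counts()               # treats -0.01 and -0.010000001 as distinct
rounded_counts = s.round(6).value_counts()  # merges values equal to 6 decimals

# Printing the raw index at full precision reveals the hidden difference.
distinct_values = [repr(v) for v in raw_counts.index]
```

Whether rounding is acceptable depends on the data; inspecting the index with `repr` first shows whether the values really differ.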

Vectorize a 6 for loop cumulative sum in python

Question: The mathematical problem is: The expression within the sums is actually much more complex than the one above, but this is a minimal working example so as not to over-complicate things. I have written this in Python using 6 nested for loops and, as expected, it performs very badly (the true form performs badly and needs evaluating millions of times), even with help from Numba, Cython and friends. Here it is written using nested for loops and a cumulative sum:

    import numpy as np
    def func1(a,b,c,d):

Finding location in code for numpy RuntimeWarning

Question: I am getting warnings like these when running numpy on a reasonably large pipeline:

    RuntimeWarning: invalid value encountered in true_divide
    RuntimeWarning: invalid value encountered in greater

How do I find where they are occurring in the code, besides writing dozens of print statements? Python 2.7 and numpy 1.8.1.

Answer 1: One way is to convert the warnings to errors:

    import warnings
    warnings.simplefilter('error', RuntimeWarning)

Then the traceback will tell you where the error occurred.

Source: https:
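
The answer's approach can be scoped to one block of code so that the rest of the pipeline keeps its normal warning behaviour. The sketch below shows that, plus NumPy's own `np.errstate` context manager as an alternative (a swapped-in technique, not part of the quoted answer). The function names are hypothetical.

```python
import warnings

import numpy as np


def divide_with_warning_filter(a, b):
    """Divide, turning any RuntimeWarning inside this call into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", RuntimeWarning)
        with np.errstate(all="warn"):  # make sure NumPy warns rather than ignores
            return np.asarray(a, dtype=float) / np.asarray(b, dtype=float)


def divide_with_errstate(a, b):
    """Divide, asking NumPy itself to raise FloatingPointError on bad values."""
    with np.errstate(invalid="raise", divide="raise"):
        return np.asarray(a, dtype=float) / np.asarray(b, dtype=float)
```

In both cases the exception carries a traceback pointing at the offending line, which is the information the question asks for.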

Could not convert string to float error from the Titanic competition

Question: I'm trying to solve the Titanic survival program from Kaggle. It's my first step in actually learning Machine Learning. I have a problem where the gender column causes an error. The stack trace says could not convert string to float: 'female'. How did you guys come across this issue? I don't want solutions. I just want a practical approach to this problem, because I do need the gender column to build my model. This is my code:

    import pandas as pd
    from sklearn.tree import DecisionTreeRegressor
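
The error message means the model received raw strings where it expects numbers, so the usual approach is to encode the categorical column before fitting. The sketch below illustrates the idea on a tiny made-up frame; the column name `Sex` and the 0/1 mapping are assumptions, not taken from the asker's code.

```python
import pandas as pd

# Made-up rows standing in for the real data; "Sex" is an assumed column name.
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Age": [22.0, 38.0, 26.0],
})

# Replace each category with a number so the column can be cast to float.
df["Sex_code"] = df["Sex"].map({"male": 0, "female": 1})

features = df[["Sex_code", "Age"]]  # numeric-only frame, safe to pass to a model
```

Values missing from the mapping become NaN, so it is worth checking `df["Sex_code"].isna().any()` before training.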

Conditional formatting for 2- or 3-scale coloring of cells of a table

Question: I would like to output a simple table to a PDF file with some conditional formatting: 2- or 3-scale coloring of cells depending on the value, like the red-white-green color scaling in the Microsoft Excel conditional formatting option.

    import pandas
    import numpy as np
    df = pandas.DataFrame(np.random.randn(10, 2), columns=list('ab'))
    print(df)

    #Output:
              a         b
    0 -1.625192 -0.949186
    1 -0.089884  0.825922
    2  2.117651 -0.046258
    3 -0.921751 -0.144447
    4 -0.294095 -1.774725
    5 -0.780523 -0.435909
    6 0.544958 0
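
A rough sketch of one possible approach: compute a red-white-green colour per cell with NumPy, then draw the table with Matplotlib and save it as a PDF. The helper names are hypothetical, and the white midpoint is assumed to sit at zero with a symmetric range, which is not necessarily how Excel chooses its midpoint.

```python
import numpy as np


def three_colour_scale(values):
    """Map values to RGB: most negative -> red, zero -> white, most positive -> green."""
    values = np.asarray(values, dtype=float)
    limit = np.abs(values).max()
    if limit == 0:
        limit = 1.0  # all zeros: every cell stays white
    t = (values / limit)[..., np.newaxis]  # scaled to [-1, 1], extra axis for RGB
    red, white, green = np.array([1.0, 0.0, 0.0]), np.ones(3), np.array([0.0, 1.0, 0.0])
    negative = white + (red - white) * np.clip(-t, 0, 1)
    positive = white + (green - white) * np.clip(t, 0, 1)
    return np.where(t < 0, negative, positive)


def save_table_pdf(values, columns, path):
    """Draw the coloured table with Matplotlib and write it to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen
    import matplotlib.pyplot as plt

    values = np.asarray(values, dtype=float)
    fig, ax = plt.subplots()
    ax.axis("off")
    ax.table(
        cellText=[[f"{v:.3f}" for v in row] for row in values],
        cellColours=three_colour_scale(values).tolist(),
        colLabels=list(columns),
        loc="center",
    )
    fig.savefig(path)
    plt.close(fig)
```

With the DataFrame from the question, the call would look like `save_table_pdf(df.values, df.columns, "table.pdf")`, where the file name is arbitrary.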

Parallelize loop over numpy rows

Question: I need to apply the same function to every row in a numpy array and store the result again in a numpy array.

    # states will contain results of function applied to a row in array
    states = np.empty_like(array)
    for i, ar in enumerate(array):
        states[i] = function(ar, *args)
    # do some other stuff on states

function does some non-trivial filtering of my data and returns an array when the conditions are True and when they are False. function can either be pure Python or Cython compiled. The
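
The loop above can be handed to an executor from the standard library's `concurrent.futures`. This is a minimal sketch, not a benchmarked solution: `apply_rows` is a hypothetical helper and `clip_row` is a made-up stand-in for the asker's `function`. Threads only help if the row function releases the GIL (as NumPy calls and suitably written Cython code can); for a pure Python function, passing `ProcessPoolExecutor` as `executor_cls` is the usual alternative, provided the function is picklable.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

import numpy as np


def apply_rows(function, array, *args, executor_cls=ThreadPoolExecutor, max_workers=None):
    """Apply function(row, *args) to every row and collect the results in order."""
    states = np.empty_like(array)
    # repeat() supplies the same extra arguments for every row.
    extra = [repeat(arg) for arg in args]
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so row i lands in states[i].
        for i, result in enumerate(pool.map(function, array, *extra)):
            states[i] = result
    return states


def clip_row(row, low, high):
    """Made-up stand-in for the real row function."""
    return np.clip(row, low, high)


array = np.arange(12.0).reshape(3, 4)
states = apply_rows(clip_row, array, 2.0, 8.0)
```

Because each row is dispatched as a separate task, this only pays off when the per-row work is much more expensive than the scheduling overhead.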