series

Pandas: change data type of Series to String

放荡痞女 submitted on 2019-11-26 20:05:05
I use pandas 0.12.0 with Python 2.7 and have a DataFrame as below:

    df = pd.DataFrame({'id': [123, 512, 'zhub1', 12354.3, 129, 753, 295, 610],
                       'colour': ['black', 'white', 'white', 'white', 'black', 'black', 'white', 'white'],
                       'shape': ['round', 'triangular', 'triangular', 'triangular', 'square', 'triangular', 'round', 'triangular']},
                      columns=['id', 'colour', 'shape'])

The id Series consists of some integers and strings. Its dtype by default is object. I want to convert all contents of id to strings. I tried astype(str), which produces the output below.

    df['id'].astype(str)
    0    1
    1    5
    2    z
    3    1
    4
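The truncated output above suggests that astype(str) on that old version kept only the first character of each value. A minimal sketch of one way around this, assuming a reasonably recent pandas: map Python's built-in str over the elements, so the result does not depend on how a given pandas version implements astype(str). The frame here is cut down to the id column for brevity.

```python
import pandas as pd

df = pd.DataFrame({'id': [123, 512, 'zhub1', 12354.3, 129, 753, 295, 610]})

# Call Python's str() on every element of the column.
id_as_str = df['id'].map(str)

print(id_as_str.tolist())
```

On current pandas releases, df['id'].astype(str) should give the same values; the map form is simply the more conservative choice.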

Python: Pandas Series - Why use loc?

随声附和 submitted on 2019-11-26 18:27:55
Why do we use loc for pandas DataFrames? It seems the following code, with or without loc, compiles and runs at a similar speed:

    %timeit df_user1 = df.loc[df.user_id=='5561']
    100 loops, best of 3: 11.9 ms per loop

or

    %timeit df_user1_noloc = df[df.user_id=='5561']
    100 loops, best of 3: 12 ms per loop

So why use loc?

Edit: This has been flagged as a duplicate question. But although "pandas iloc vs ix vs loc explanation?" does mention that

    you can do column retrieval just by using the data frame's getitem:
    df['time']  # equivalent to df.loc[:, 'time']

it does not say why we use loc,
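The equivalences mentioned in the question can be checked directly. This is a small sketch on hypothetical data (only the column names user_id and time come from the question). It also shows that loc accepts a row mask and a column label in one call, which is one commonly cited reason to prefer it; the thread's actual answer is not included in this excerpt.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({'user_id': ['5561', '1234', '5561'],
                   'time': [10, 20, 30]})

mask = df.user_id == '5561'

with_loc = df.loc[mask]      # row selection through loc
without_loc = df[mask]       # the same rows through plain getitem

# loc can select rows and a column in a single indexing call.
times = df.loc[mask, 'time']

print(with_loc.equals(without_loc))
print(times.tolist())
```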

Print series of prime numbers in python

心不动则不痛 submitted on 2019-11-26 16:26:41
I am trying to learn Python programming, and I'm pretty new at this. I was having issues printing a series of prime numbers from one to one hundred. I can't figure out what's wrong with my code. Here's what I wrote; it prints all the odd numbers instead of primes:

    for num in range(1,101):
        for i in range(2,num):
            if (num%i==0):
                break
            else:
                print(num)
                break

Igor Chubin

You need to check all numbers from 2 to n-1 (to sqrt(n) actually, but ok, let it be n). If n is divisible by any of the numbers, it is not prime. If a number is prime, print it.

    for num in range(2,101):
        prime = True
        for i in range(2
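Since the answer's code is cut off, here is a self-contained sketch of the approach it describes: test each candidate against possible divisors and keep it only if none divides it. The function name primes_up_to is made up for this illustration, and the loop stops at sqrt(num) as the answer hints, rather than at num - 1.

```python
def primes_up_to(limit):
    """Return all primes p with 2 <= p <= limit, found by trial division."""
    primes = []
    for num in range(2, limit + 1):
        is_prime = True
        i = 2
        # A composite num always has a divisor i with i * i <= num,
        # so there is no need to test divisors beyond sqrt(num).
        while i * i <= num:
            if num % i == 0:
                is_prime = False
                break
            i += 1
        if is_prime:
            primes.append(num)
    return primes


print(primes_up_to(100))
```

The key difference from the question's code is that a number is reported only after every candidate divisor has been tried, not after the first one that fails to divide it.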

Remove NaN from pandas series

别来无恙 submitted on 2019-11-26 15:56:56
Question: Is there a way to remove NaN values from a pandas Series? I have a series that may or may not have some NaN values in it, and I'd like to return a copy of the series with all the NaNs removed.

Answer 1:

    >>> s = pd.Series([1,2,3,4,np.nan,5,np.nan])
    >>> s[~s.isnull()]
    0    1
    1    2
    2    3
    3    4
    5    5

Update: an even better approach, as @DSM suggested in the comments, is to use pandas.Series.dropna():

    >>> s.dropna()
    0    1
    1    2
    2    3
    3    4
    5    5

Answer 2: A small use of the fact that np.nan != np.nan:

    s[s==s]
    Out[953]:
    0    1.0
    1    2.0
    2    3.0
    3    4.0
    5    5.0
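A runnable version of the two main approaches from the answers, as a sketch assuming current pandas and NumPy (np.nan is used because it is spelled the same way in every NumPy version):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4, np.nan, 5, np.nan])

# dropna() returns a new Series; the original s is left unchanged.
cleaned = s.dropna()

# Equivalent boolean-mask form: keep only the non-missing entries.
by_mask = s[s.notna()]

print(cleaned)
```

Note that the surviving rows keep their original index labels (0, 1, 2, 3, 5), as in the outputs quoted above.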

Convert pandas data frame to series

馋奶兔 submitted on 2019-11-26 15:30:47
Question: I'm somewhat new to pandas. I have a pandas DataFrame that is 1 row by 23 columns, and I want to convert it into a Series. What is the most Pythonic way to do this? I've tried pd.Series(myResults), but it complains: ValueError: cannot copy sequence with size 23 to array axis with dimension 1. It's not smart enough to realize it's still a "vector" in math terms. Thanks!

Answer 1:

    It's not smart enough to realize it's still a "vector" in math terms.

Say rather that it's smart enough to
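The answer is cut off here, so the following is not necessarily what it goes on to recommend. As a sketch of two standard ways to get a Series from a single-row frame, using a hypothetical 1 x 23 frame named my_results with made-up column names:

```python
import pandas as pd

# Hypothetical 1 x 23 frame standing in for the question's myResults.
columns = ['c%d' % i for i in range(23)]
my_results = pd.DataFrame([list(range(23))], columns=columns)

# Select the first (and only) row by position; the result is a Series
# indexed by the column names.
row = my_results.iloc[0]

# squeeze() drops every axis of length 1, which gives the same Series here.
squeezed = my_results.squeeze()

print(type(row).__name__, len(row))
```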

JFreeChart series tool tip above shape annotation

旧城冷巷雨未停 submitted on 2019-11-26 13:51:05
I have an XYPlot containing series and a couple of dynamically added shape annotations with no fill (hence each of the series points is visible). Is it possible to display the series tool tips (which show the coordinates of the series point under the mouse pointer) over the annotations? Or how can I rearrange the elements to make the tool tip visible?

I suspect you are adding the shape annotations to the plot, where they are drawn last. Instead, add them to the renderer in Layer.BACKGROUND. As shown below, the circle does not obscure the tool tip at (20,

Sorting a pandas series

时光怂恿深爱的人放手 submitted on 2019-11-26 12:42:38
Question: I am trying to figure out how to sort the Series generated as a result of a groupby aggregation in a smart way. I generate an aggregation of my DataFrame like this:

    means = df.testColumn.groupby(df.testCategory).mean()

This results in a Series. I now try to sort this by value, but get an error:

    means.sort()
    ...
    -> Exception: This Series is a view of some other array, to sort in-place you must create a copy

I then try creating a copy:

    meansCopy = Series(means)
    meansCopy.sort()
    -> Exception:
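For readers on a current pandas release, here is a sketch of the same aggregation sorted by value. It uses Series.sort_values, which returns a sorted copy; the in-place Series.sort() used in the question has since been removed from pandas. The data is hypothetical, and only the column names testColumn and testCategory come from the question.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({'testCategory': ['a', 'b', 'a', 'c', 'c'],
                   'testColumn': [1.0, 10.0, 3.0, 4.0, 6.0]})

means = df.testColumn.groupby(df.testCategory).mean()

# sort_values() returns a new Series ordered by value and leaves
# `means` itself untouched, so no explicit copy is needed.
means_sorted = means.sort_values()

print(means_sorted)
```

Passing ascending=False to sort_values gives the largest means first.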

Pandas selecting by label sometimes returns Series, sometimes returns DataFrame

*爱你&永不变心* submitted on 2019-11-26 12:37:21
Question: In pandas, when I select a label that has only one entry in the index I get back a Series, but when I select a label that has more than one entry I get back a DataFrame. Why is that? Is there a way to ensure I always get back a DataFrame?

    In [1]: import pandas as pd
    In [2]: df = pd.DataFrame(data=range(5), index=[1, 2, 3, 3, 3])
    In [3]: type(df.loc[3])
    Out[3]: pandas.core.frame.DataFrame
    In [4]: type(df.loc[1])
    Out[4]: pandas.core.series.Series

Answer 1: Granted that the behavior is
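The answer is truncated, so this sketch may differ from what it suggests. One way to always get a DataFrame back is to pass the label inside a list: a list indexer makes loc return a DataFrame regardless of how many rows match.

```python
import pandas as pd

df = pd.DataFrame(data=range(5), index=[1, 2, 3, 3, 3])

# A scalar label returns a Series when the label is unique ...
single = df.loc[1]

# ... but a list of labels always returns a DataFrame,
# whether one row matches or several.
one_row = df.loc[[1]]
three_rows = df.loc[[3]]

print(type(single).__name__, type(one_row).__name__, type(three_rows).__name__)
```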

Pandas: change data type of Series to String

穿精又带淫゛_ submitted on 2019-11-26 08:59:56
Question: I use pandas 0.12.0 with Python 2.7 and have a DataFrame as below:

    df = pd.DataFrame({'id': [123, 512, 'zhub1', 12354.3, 129, 753, 295, 610],
                       'colour': ['black', 'white', 'white', 'white', 'black', 'black', 'white', 'white'],
                       'shape': ['round', 'triangular', 'triangular', 'triangular', 'square', 'triangular', 'round', 'triangular']},
                      columns=['id', 'colour', 'shape'])

The id Series consists of some integers and strings. Its dtype by default is

Pandas pd.Series.isin performance with set versus array

人盡茶涼 submitted on 2019-11-26 08:11:18
Question: In Python generally, membership of a hashable collection is best tested via set. We know this because the use of hashing gives us O(1) lookup complexity versus O(n) for list or np.ndarray. In pandas, I often have to check for membership in very large collections. I presumed that the same would apply, i.e. checking each item of a series for membership in a set is more efficient than using list or np.ndarray. However, this doesn't seem to be the case:

    import numpy as np
    import pandas as pd
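The question's benchmark is cut off after its imports. As a rough sketch of how such a comparison could be set up (the sizes and variable names here are hypothetical, not the asker's), the snippet below passes the same lookup values to Series.isin as a set, a list, and an array, and times each call. No particular timings are claimed; run it locally to see how the three compare on your versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical sizes, chosen small enough to run quickly.
s = pd.Series(np.arange(100_000))
lookup_arr = np.arange(0, 100_000, 100)   # 1000 values to look for
lookup_list = lookup_arr.tolist()
lookup_set = set(lookup_list)

# The three calls return the same boolean mask; only the container differs.
in_set = s.isin(lookup_set)
in_list = s.isin(lookup_list)
in_arr = s.isin(lookup_arr)

for name, values in [('set', lookup_set), ('list', lookup_list), ('array', lookup_arr)]:
    seconds = timeit.timeit(lambda: s.isin(values), number=3)
    print(name, seconds)
```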