series

Python difference in years between a datetime.now() and a Series filled up with dates?

喜欢而已 submitted on 2019-12-22 10:43:57
Question: I would like to create a new column in my dataset that holds the difference in years between today and another column already in the dataset, which is filled with dates. This code: df['diff_years'] = datetime.today() - df['some_date'] df['diff_years'] gives me the following output (example): 1754 days 11:44:28.971615 and I need something like this instead (the output above expressed in years): 4,8 (or 5) I appreciate any help! PS: I would like to avoid looping over the series, a path I believe would give me a
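
A minimal sketch of one loop-free way to get the difference in years, assuming `some_date` is already a datetime column. The sample rows and the fixed reference date are hypothetical, chosen only so the result is reproducible; dividing by 365.25 is an approximation of a year, not a calendar-exact count.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data; 'some_date' mirrors the column name in the question.
df = pd.DataFrame({"some_date": pd.to_datetime(["2015-03-01", "2018-06-15"])})

# Fixed reference date for reproducibility; datetime.today() would be used in practice.
today = datetime(2019, 12, 22)

# The subtraction yields a timedelta Series; .dt.days extracts whole days without a loop.
delta = today - df["some_date"]

# Approximate years by dividing by the average year length, rounded to one decimal.
df["diff_years"] = (delta.dt.days / 365.25).round(1)
print(df)
```

Rounding to zero decimals instead would give whole years, as in the "(or 5)" variant mentioned in the question.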

Convert MultiIndex DataFrame to Series

放肆的年华 submitted on 2019-12-21 20:54:10
Question: I created a MultiIndex DataFrame with: df.set_index(['Field1', 'Field2'], inplace=True) If this is not a MultiIndex DataFrame, please tell me how to make one. I want to: (1) group by the same columns that are in the index, (2) aggregate a count of each group, and (3) return the whole thing as a Series with Field1 and Field2 as the index. How do I go about doing this? ADDITIONAL INFO I have a MultiIndex DataFrame that looks like this: Continent Sector Count Asia 1 4 2 1 Australia 1 1 Europe 1 1 2 3 3 2 North
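
The three steps in the question can be sketched as a single `groupby` on the index levels followed by `size()`, which returns a Series. The rows and the `Value` column below are hypothetical; only `Field1` and `Field2` come from the question.

```python
import pandas as pd

# Hypothetical rows; 'Field1' and 'Field2' are the index names used in the question.
df = pd.DataFrame({
    "Field1": ["Asia", "Asia", "Asia", "Europe", "Europe"],
    "Field2": [1, 1, 2, 1, 1],
    "Value": [10, 20, 30, 40, 50],
})
df.set_index(["Field1", "Field2"], inplace=True)

# Group on the index levels and count the rows in each group.
# size() returns a Series whose index is the (Field1, Field2) MultiIndex.
counts = df.groupby(level=["Field1", "Field2"]).size()
print(counts)
```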

Append to Series in python/pandas not working

邮差的信 submitted on 2019-12-21 07:30:30
Question: I am trying to append values to a pandas Series, where each value is the difference between the nth and (n + 1)th element: q = pd.Series([]) while i < len(other array): diff = some int value a = pd.Series([diff], ignore_index=True) q.append(a) i+=1 The output I get is: Series([], dtype: float64) Why am I not getting an array with all the appended values? -- P.S. This is a data science question where I have to find the state with the most counties by searching through a dataframe. I am using the
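
One thing visible in the snippet is that the return value of `q.append(a)` is discarded; `Series.append` returned a new Series rather than modifying `q` in place, and it was removed in pandas 2.0. A minimal sketch of two alternatives follows, with a hypothetical `values` list standing in for the "other array" in the question.

```python
import pandas as pd

# Hypothetical input standing in for the "other array" in the question.
values = [3, 7, 12, 20]

# Option 1: collect the differences in a plain list, then build the Series once.
diffs = [values[i + 1] - values[i] for i in range(len(values) - 1)]
q = pd.Series(diffs)

# Option 2: vectorised. diff() computes element n minus element n-1;
# the first entry has no predecessor and is NaN, so it is dropped.
q_vec = pd.Series(values).diff().dropna().astype(int).reset_index(drop=True)

print(q.tolist(), q_vec.tolist())
```

Building the Series once from a list also avoids reallocating the Series on every iteration.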

Excel: Duplicated PlotOrder for two Series in a Chart

烂漫一生 submitted on 2019-12-20 07:18:09
Question: I have a ChartObject with 10 Series. Two Series have .PlotOrder = 1, and two others have .PlotOrder = 2. As a result, the .PlotOrder of the last Series is 8. Can this be explained? I expected .PlotOrder to span from 1 to .Count. As proof: during execution of a Sub, I get in cho a reference to the ChartObject in question. Then, in the Immediate window: ? cho.Chart.SeriesCollection(cho.Chart.SeriesCollection.Count).PlotOrder 8 ? cho.Chart.SeriesCollection.Count 10

The pd.Series.prod() function

巧了我就是萌 submitted on 2019-12-20 07:13:30
Question: This is probably elementary, but I still cannot figure it out. I am reading the documentation on pd.Series and doing simple exercises. My code is the following: import pandas as pd import numpy as np pd.Series([2, 4, 6]).prod() Out[7]: 48 a = pd.Series(np.arange(1, 100, 3)) a Out[9]: 0 1 1 4 2 7 3 10 4 13 5 16 6 19 7 22 8 25 9 28 10 31 11 34 12 37 13 40 14 43 15 46 16 49 17 52 18 55 19 58 20 61 21 64 22 67 23 70 24 73 25 76 26 79 27 82 28 85 29 88 30 91 31 94 32 97 dtype: int32 a.prod(
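
The excerpt is cut off at `a.prod(`, so the actual question is not visible. One thing worth checking with a product this large is integer overflow: the true product of these 33 values does not fit in any fixed-width integer type. The sketch below, under that assumption, compares the exact value with fixed-width and floating-point products.

```python
import math

import numpy as np
import pandas as pd

a = pd.Series(np.arange(1, 100, 3), dtype="int64")

# Exact product using Python's arbitrary-precision integers.
exact = math.prod(int(x) for x in a)

# A 64-bit integer product wraps around silently when it overflows.
wrapped64 = int(a.prod())

# Forcing 32-bit arithmetic (the dtype shown in the question's output):
# the true product is a multiple of 2**32, so the wrapped result is 0.
wrapped32 = int(np.prod(a.to_numpy(dtype=np.int32), dtype=np.int32))

# A floating-point product keeps the magnitude but not every digit.
approx = a.astype("float64").prod()

print(exact, wrapped64, wrapped32, approx)
```

If the exact value matters, Python integers (or an object-dtype Series) avoid the wraparound; if only the magnitude matters, converting to float first is usually enough.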

How to reverse the order of first and last name in a Pandas Series

拥有回忆 submitted on 2019-12-20 04:53:50
Question: I have a pandas Series: names = pd.Series([ 'Andre Agassi', 'Barry Bonds', 'Christopher Columbus', 'Daniel Defoe', 'Emilio Estevez', 'Fred Flintstone', 'Greta Garbo', 'Humbert Humbert', 'Ivan Ilych']) which looks like this: 0 Andre Agassi 1 Barry Bonds 2 Christopher Columbus 3 Daniel Defoe 4 Emilio Estevez 5 Fred Flintstone 6 Greta Garbo 7 Humbert Humbert 8 Ivan Ilych and I want to make it look like this: 0 Agassi, Andre 1 Bonds, Barry 2 Columbus, Christopher 3 Defoe, Daniel 4 Estevez, Emilio 5
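
A minimal sketch of one vectorised way to produce the "Last, First" form, assuming every entry is exactly "First Last" separated by a single space (true for the names listed in the question, only three of which are reused here).

```python
import pandas as pd

names = pd.Series(["Andre Agassi", "Barry Bonds", "Christopher Columbus"])

# Split once on the first space into two columns: 0 = first name, 1 = last name.
parts = names.str.split(" ", n=1, expand=True)

# Rejoin as "Last, First" using element-wise string concatenation.
reversed_names = parts[1] + ", " + parts[0]
print(reversed_names)
```

Names with middle names or multi-word surnames would need a different splitting rule.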

Conditionally filling blank values in Pandas dataframes

夙愿已清 submitted on 2019-12-20 03:03:01
Question: I have a DataFrame that looks as follows (more columns have been dropped): memberID shipping_country 264991 264991 Canada 100 USA 5000 5000 UK I'm trying to fill the blank cells with the existing value of shipping_country for each user: memberID shipping_country 264991 Canada 264991 Canada 100 USA 5000 UK 5000 UK However, I'm not sure what the most efficient way is to do this on a large-scale dataset. Perhaps using a vectorized groupby method? Answer 1: You can use chained groupby
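
The answer is cut off, so its exact code is not visible. One way a per-member fill might look is sketched below, assuming the blank cells are stored as NaN; if they are empty strings, they would need to be replaced with NaN first.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the question; blanks are assumed to be NaN.
df = pd.DataFrame({
    "memberID": [264991, 264991, 100, 5000, 5000],
    "shipping_country": [np.nan, "Canada", "USA", np.nan, "UK"],
})

# Within each member, propagate known values forward, then backward,
# so a blank is filled whether the known value comes before or after it.
df["shipping_country"] = (
    df.groupby("memberID")["shipping_country"]
      .transform(lambda s: s.ffill().bfill())
)
print(df)
```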

Efficient pandas/numpy function for time since change

为君一笑 submitted on 2019-12-20 02:47:12
Question: Given a Series, I would like to efficiently compute how many observations have passed since the last change. Here is a simple example: ser = pd.Series([1.2,1.2,1.2,1.2,2,2,2,4,3]) print(ser) 0 1.2 1 1.2 2 1.2 3 1.2 4 2.0 5 2.0 6 2.0 7 4.0 8 3.0 I would like to apply a function to ser that would result in: 0 0 1 1 2 2 3 3 4 0 5 1 6 2 7 0 8 0 As I am dealing with large series, I would prefer a fast solution that does not involve looping. Thanks Edit If possible, would like the function to
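
A minimal loop-free sketch that reproduces the expected output in the question: label each run of equal values, then count positions within each run. The names `run_id` and `since_change` are illustrative only.

```python
import pandas as pd

ser = pd.Series([1.2, 1.2, 1.2, 1.2, 2, 2, 2, 4, 3])

# A new run starts wherever a value differs from its predecessor;
# the cumulative sum of those flags gives each run its own id.
run_id = (ser != ser.shift()).cumsum()

# cumcount() numbers the rows inside each run starting from 0,
# which is the number of observations since the last change.
since_change = ser.groupby(run_id).cumcount()
print(since_change.tolist())
```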

Set Pandas column values to an array

跟風遠走 submitted on 2019-12-20 02:32:58
Question: I have the following problem. I have a dataframe like this one: col1 col2 col3 0 2 5 4 1 4 3 5 2 6 2 7 Now I have an array, for example a = [5,5,5], and I want to insert this array in col3, but only in specific rows (let's say 0 and 2), to obtain something like this: col1 col2 col3 0 2 5 [5,5,5] 1 4 3 5 2 6 2 [5,5,5] The problem is that when I try to do: zip_df.at[[0,2],'col3'] = a I receive the following error: ValueError: Must have equal len keys and value when setting with an ndarray. How can
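
A minimal sketch of one way to store a list in individual cells, assuming it is acceptable for `col3` to become an object column. `.at` addresses a single cell, so the rows are assigned one at a time instead of passing `[0, 2]`.

```python
import pandas as pd

zip_df = pd.DataFrame({"col1": [2, 4, 6], "col2": [5, 3, 2], "col3": [4, 5, 7]})
a = [5, 5, 5]

# An integer column cannot hold a list, so switch the column to object dtype first.
zip_df["col3"] = zip_df["col3"].astype(object)

# Assign cell by cell; list(a) gives each row its own copy of the list.
for row in [0, 2]:
    zip_df.at[row, "col3"] = list(a)

print(zip_df)
```

Object columns lose vectorised arithmetic, so this layout is best kept for cases where the list really has to live in the cell.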

How do I change a Series color in Excel using C#?

。_饼干妹妹 submitted on 2019-12-20 01:46:45
Question: I have written a C# program that automatically generates a graph from a CSV file and puts it into a new XLS file. However, I need to change the color of the line (it is a line chart) to red rather than the default blue. I am finding this extremely difficult to do, and what I've found online has not worked. Can someone please tell me how to do this? Answer 1: Here is an example. I noticed when I tried to pass an integer that the bytes seem to be read in reverse order. So