pandas | 易学教程

How to calculate time difference between two pandas column [duplicate]

Question: This question already has answers here: Calculate Pandas DataFrame Time Difference Between Two Columns in Hours and Minutes (3 answers). Closed 2 years ago.

My df looks like:

                    start                stop
    0 2015-11-04 10:12:00 2015-11-06 06:38:00
    1 2015-11-04 10:23:00 2015-11-05 08:30:00
    2 2015-11-04 14:01:00 2015-11-17 10:34:00
    4 2015-11-19 01:43:00 2015-12-21 09:04:00

    print(time_df.dtypes)

    start    datetime64[ns]
    stop     datetime64[ns]
    dtype: object

I am trying to find the time difference between stop and start.
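Since both columns are already datetime64[ns], subtracting one from the other should give a timedelta column directly. The sketch below rebuilds the frame from the question; the column names diff and diff_hours are illustrative and not part of the original post.

```python
import pandas as pd

# Rebuild the frame shown in the question (index 3 is absent there too).
time_df = pd.DataFrame(
    {
        "start": pd.to_datetime([
            "2015-11-04 10:12:00",
            "2015-11-04 10:23:00",
            "2015-11-04 14:01:00",
            "2015-11-19 01:43:00",
        ]),
        "stop": pd.to_datetime([
            "2015-11-06 06:38:00",
            "2015-11-05 08:30:00",
            "2015-11-17 10:34:00",
            "2015-12-21 09:04:00",
        ]),
    },
    index=[0, 1, 2, 4],
)

# Subtracting two datetime columns yields a timedelta column.
time_df["diff"] = time_df["stop"] - time_df["start"]

# Convert the timedelta to a plain float number of hours (assumed to be the unit wanted).
time_df["diff_hours"] = time_df["diff"].dt.total_seconds() / 3600

print(time_df)
```

The timedelta column keeps full precision, while the float column is easier to aggregate or plot.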

Merging two pandas dataframes on multiple columns

Question: I have two dataframes:

    >>> df1
    [Output]:
    col1 col2 col3 col4
    a    abc  10   str1
    b    abc  20   str2
    c    def  20   str2
    d    abc  30   str2

    >>> df2
    [Output]:
    col1 col2 col3 col5 col6
    d    abc  30   str6 47
    b    abc  20   str5 66
    c    def  20   str7 53
    a    abc  10   str5 21

Below is what I want to generate:

    >>> df_merged
    [Output]:
    col1 col2 col5
    a    abc  str5
    b    abc  str5
    c    def  str7
    d    abc  str6

I don't want to generate more than 4 rows, and that is usually what happens when I try to merge the dataframes. Thanks for the tips!

Answer 1: Use .merge by
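The answer above is cut off, so the following is only one plausible way to get the four-row result, not necessarily what the answerer wrote. It assumes that col1, col2 and col3 together identify a row in both frames, and merges on all three before selecting the wanted columns.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": ["a", "b", "c", "d"],
    "col2": ["abc", "abc", "def", "abc"],
    "col3": [10, 20, 20, 30],
    "col4": ["str1", "str2", "str2", "str2"],
})
df2 = pd.DataFrame({
    "col1": ["d", "b", "c", "a"],
    "col2": ["abc", "abc", "def", "abc"],
    "col3": [30, 20, 20, 10],
    "col5": ["str6", "str5", "str7", "str5"],
    "col6": [47, 66, 53, 21],
})

# Merging on every shared key column avoids the row blow-up that occurs
# when the join key (for example col2 alone) is not unique.
df_merged = df1.merge(df2, on=["col1", "col2", "col3"], how="inner")

# Keep only the columns requested in the question.
df_merged = df_merged[["col1", "col2", "col5"]]
print(df_merged)
```

Merging on a single non-unique column such as col2 would pair every "abc" row with every other "abc" row, which is a common source of the extra rows described in the question.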

Importing datetimes to pandas DataFrame raises OutOfBoundsDatetime error

Question: I'm trying to import data into a pandas DataFrame, but I get the following error when converting the date_time column to datetime objects:

    pandas.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-19 00:00:00

The format of the column looks like:

    Jan 19,17 05:04:50 PM

My code is:

    data['Date_Time'] = to_datetime(data['Date_Time']).dt.strftime('%b %d, %y %H:%M:%S ')

What is the problem?

Answer 1: I think you need to_datetime with parameter format: data = pd.DataFrame({'Date_Time
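The answer is truncated, so here is a minimal sketch of what passing format to pd.to_datetime could look like for strings shaped like the one in the question. The format string and the second sample row are assumptions based on that single example, and %b and %p are interpreted according to the current locale.

```python
import pandas as pd

# First value is the sample from the question; the second is made up for illustration.
data = pd.DataFrame({"Date_Time": ["Jan 19,17 05:04:50 PM", "Feb 03,17 11:15:00 AM"]})

# %b = abbreviated month, %d = day, %y = two-digit year,
# %I = 12-hour clock hour, %p = AM/PM marker.
data["Date_Time"] = pd.to_datetime(data["Date_Time"], format="%b %d,%y %I:%M:%S %p")

print(data)
```

The year in the error message (1-01-19) suggests that, without an explicit format, part of the string was inferred as year 1, which falls outside the range a nanosecond timestamp can represent. Note also that .dt.strftime turns the parsed values back into strings, so it is best applied only when formatting for display.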

Python: Why is np.where not working with two conditions?

Question: I have the following data frame:

    >>> import pandas as pd
    >>> import numpy as np
    >>> df_test = pd.DataFrame({'id': [100, 101, 102, 103, 104], 'drive': ['4WD', None, '4WD', None, '2WD']})
    >>> print(df_test)
        id drive
    0  100   4WD
    1  101  None
    2  102   4WD
    3  103  None
    4  104   2WD

I would like to make a new column, is_4x4, that is equal to 0 when drive is None or drive is 2WD. In all other cases, I would like the column to be equal to 1. I am using the following code, but the result is not as I
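The asker's own code is cut off, so the sketch below does not diagnose it. It only shows one way to express the two conditions with np.where. Two details tend to matter in this kind of expression: missing values are tested with .isna() rather than == None, and each comparison is wrapped in parentheses before combining with |.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({
    "id": [100, 101, 102, 103, 104],
    "drive": ["4WD", None, "4WD", None, "2WD"],
})

# Element-wise OR of two boolean masks: drive is missing, or drive equals "2WD".
not_4x4 = df_test["drive"].isna() | (df_test["drive"] == "2WD")

# np.where(condition, value_if_true, value_if_false)
df_test["is_4x4"] = np.where(not_4x4, 0, 1)

print(df_test)
```

Python's or and and keywords do not work element-wise on Series, so | and & are used for combining masks.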

non fixed rolling window

Question: I am looking to implement a rolling window on a list, but instead of a fixed window length, I would like to provide a list of window lengths. Something like this:

    l1 = [5, 3, 8, 2, 10, 12, 13, 15, 22, 28]
    l2 = [1, 2, 2, 2, 3, 4, 2, 3, 5, 3]
    get_custom_roling( l1, l2, np.average)

and the result would be:

    [5, 4, 5.5, 5, 6.67, ....]

6.67 is calculated as the average of the 3 elements 10, 2, 8. I implemented a slow solution, and every idea to make it quicker is welcome :): import numpy as np def get_the
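The asker's implementation is truncated, so the following is an independent sketch rather than a fix of it. It assumes the window at position i covers the l2[i] elements ending at i, clipped at the start of the list, which matches the sample results above. The function names custom_rolling and custom_rolling_mean are hypothetical.

```python
import numpy as np


def custom_rolling(values, windows, func):
    """Apply func to a trailing window whose length varies per position."""
    values = np.asarray(values, dtype=float)
    out = []
    for i, w in enumerate(windows):
        start = max(0, i - w + 1)  # clip the window at the start of the data
        out.append(func(values[start:i + 1]))
    return np.array(out)


def custom_rolling_mean(values, windows):
    """Vectorised special case for the mean, using prefix sums."""
    values = np.asarray(values, dtype=float)
    windows = np.asarray(windows)
    csum = np.concatenate(([0.0], np.cumsum(values)))
    ends = np.arange(1, len(values) + 1)      # exclusive end of each window
    starts = np.maximum(ends - windows, 0)    # inclusive start of each window
    return (csum[ends] - csum[starts]) / (ends - starts)


l1 = [5, 3, 8, 2, 10, 12, 13, 15, 22, 28]
l2 = [1, 2, 2, 2, 3, 4, 2, 3, 5, 3]

generic = custom_rolling(l1, l2, np.average)
fast = custom_rolling_mean(l1, l2)
print(np.round(generic, 2))
```

The generic version accepts any reducing function but loops in Python. The prefix-sum version avoids the loop and only covers sums and means.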

Specifying colors for multiple lines on plot using matplotlib and pandas [duplicate]

Question: This question already has an answer here: Matplotlib: change the colors of the result of group by (1 answer). Closed 7 months ago.

Pandas dataframe groupby plot

I have a similar dataframe to the one in the above question, but it has around 8 ticker symbols. I've defined a list of colours called 'colors' that correspond with the tickers, but when I do:

    df.groupby('ticker')['adj_close'].plot(color=colors)

all the lines on the plot for each of the tickers are the same colour (i.e. the first
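One way to sidestep the problem is to loop over the groups and pass a single colour to each plot call. The sketch below uses made-up data with three tickers, a hypothetical date column, and a dict mapping ticker to colour. None of these values come from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative data only; the real frame has around 8 tickers.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02"] * 3),
    "ticker": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC"],
    "adj_close": [10.0, 11.0, 20.0, 19.5, 30.0, 31.2],
})

# Map each ticker to its own colour (assumed mapping).
colors = {"AAA": "tab:blue", "BBB": "tab:orange", "CCC": "tab:green"}

fig, ax = plt.subplots()
for ticker, group in df.groupby("ticker"):
    # Each call draws one line, so each line receives exactly one colour.
    ax.plot(group["date"], group["adj_close"], label=ticker, color=colors[ticker])

ax.legend()
fig.savefig("tickers.png")
```

Keying the colours by ticker name rather than by list position keeps the mapping stable even if the group order changes.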